Decision Intelligence Engineer - Next Best Action

<h1><b>Become a part of our caring community</b><br> </h1>Become a part of our caring community and help us put health first. We are looking for a skilled Decision Intelligence Engineer to design, train, and improve the reinforcement learning policy at the heart of Humana's Next Best Action platform.<br>This role is hands-on and research-oriented. You will design and evaluate decision-making algorithms, and instrument training pipelines. Additionally, you will collaborate with data and platform engineers. Furthermore, you will ensure the system operates correctly within the constraints of clinical eligibility rules and program-specific objectives.<p style="text-align:inherit"></p><p style="text-align:inherit"></p><p><b>KEY RESPONSIBILITIES</b></p><p><b>Decision-Making Model Development</b></p><ul><li>Design, implement, and evaluate algorithms suited to long-horizon, sparse-reward sequential decision-making in healthcare. These algorithms include reinforcement learning methods, such as PPO, A3C, DQN, CQL, and Decision Transformer, as well as dynamic programming formulations and constrained optimization approaches.</li><li>Frame member decisioning problems as Markov Decision Processes (MDPs) or Partially Observable MDPs, defining state representations, action spaces, transition dynamics, and reward structures that encode clinical and program-specific goals.</li><li>Apply <span style="overflow-wrap: break-word; display: inline; text-decoration: inherit; hyphens: auto;">Bellman-equation-based</span> value estimation, reward shaping, and constraint formulations to encode clinical eligibility, suppression rules, and program-specific objectives directly into the learning or optimization objective.</li><li>Manage <span style="overflow-wrap: break-word; display: inline; text-decoration: inherit; hyphens: auto;">exploration-exploitation</span> tradeoffs (or equivalent uncertainty-handling in simulation and stochastic optimization) appropriate for a production healthcare 
environment where suboptimal actions have member impact.</li><li>Model member journey dynamics using tools from stochastic processes, simulation, or probabilistic graphical models to inform policy design and evaluation.</li></ul><p><b>Model Evaluation and Production Safety</b></p><ul><li>Build simulation and backtesting environments, including discrete-event simulation and Monte Carlo methods, to evaluate policy or decision quality against historical member journey data before production promotion.</li><li>Diagnose and remediate failure modes specific to learned or optimized policies, including policy collapse, credit assignment errors across long member journeys, distributional shift between training and serving populations, and constraint violations under out-of-distribution inputs.</li><li>Define performance threshold criteria and automated evaluation gates within the nightly Databricks training workflow; block promotion of underperforming policies to MLflow production.</li><li>Instrument training and optimization runs with MLflow tracking covering hyperparameters, objective curves, action distributions, and feature importance for every training cycle.</li></ul><p><b>Training Pipeline Engineering</b></p><ul><li>Own the nightly Databricks training workflow. This workflow involves feature engineering from upstream clinical and operational data sources, and state vector normalization. 
Additionally, it includes distributed training with Ray RLlib (or equivalent optimization solvers) and batch scoring of all eligible members.</li><li>Collaborate with the Data Engineering team to ensure that training inputs are correctly joined, reward signals are computed from disposition outcomes, and the feature pipeline is reproducible and auditable.</li><li>Write production-quality PySpark feature engineering jobs; maintain data lineage through Databricks Unity Catalog.</li><li>Manage model artifacts, versioning, and lifecycle in the MLflow Model Registry; ensure rollback capability is maintained at all times.</li></ul><p><b>Multi-Agent and Constraint-Aware Decisioning</b></p><ul><li>Apply multi-agent decision-making concepts (MARL via PettingZoo, or game-theoretic or cooperative optimization approaches) where member household or population-level coordination is required.</li><li>Implement constraint handling to enforce hard business rules directly within the optimization objective. These rules include member caps, cooldown periods, and clinical eligibility. 
To achieve this, use constrained MDP formulations, Lagrangian relaxation, or mixed-integer programming as appropriate, rather than relying on downstream filters.</li><li>Collaborate with rules engine stakeholders to ensure eligibility guards and policy priorities are correctly aligned and do not conflict.</li></ul><p><b>Collaboration and Governance</b></p><ul><li>Partner with decision engine and rules engine teams to ensure that model outputs integrate cleanly with the real-time decisioning hot path and that scored recommendations are correctly structured and interpreted.</li><li>Collaborate with platform architects to define feedback loop contracts: how disposition outcomes flow back through the data pipeline into the next training cycle.</li><li>Document model behavior, known limitations, and failure modes for clinical and compliance stakeholders; support explainability requirements for member-facing decisions.</li><li>Use AI-assisted engineering tools for scaffolding, testing, and documentation; ensure all core model logic and objective design remain human-authored and subject to rigorous peer review.</li></ul><h1><br><b>Use your skills to make an impact </b><br> </h1><p><b>Required Qualifications</b></p><ul><li>8+ years of software engineering or quantitative research experience building and operating large-scale production systems, with emphasis on data-intensive platforms, recommendation systems, optimization engines, or simulation frameworks serving millions of users.</li><li>3+ years of hands-on experience implementing reinforcement learning, operations research methods, or simulation-driven decision systems in production. 
Relevant backgrounds include policy gradient and value-based RL (PPO, A3C, DQN, CQL), stochastic dynamic programming, discrete-event simulation, or large-scale combinatorial or constrained optimization.</li><li>Deep familiarity with Markov Decision Processes, <span style="overflow-wrap: break-word; display: inline; text-decoration: inherit; hyphens: auto;">Bellman-equation-based</span> value estimation, reward or objective shaping, <span style="overflow-wrap: break-word; display: inline; text-decoration: inherit; hyphens: auto;">exploration-exploitation</span> tradeoffs, and constraint formulation in real-world decision systems.</li><li>Demonstrated ability to diagnose failure modes in learned or optimized policies: instability, poor credit assignment across long horizons, and distributional shift across large populations.</li><li>Proficiency in Python 3.x; experience with PyTorch or TensorFlow for policy network or learned model implementation.</li><li>Experience with Ray RLlib or equivalent distributed computation frameworks for large-scale training or optimization.</li><li>Experience with Databricks, PySpark, and Delta Lake for large-scale ML or data pipelines processing tens of millions of records.</li><li>Experience with MLflow for experiment tracking, model registry, and artifact management.</li><li>Experience with shipping systems that operate reliably under production load, not just research or prototype work.</li></ul><p></p><p><b>Preferred Qualifications</b></p><ul><li>Experience with multi-agent RL frameworks (PettingZoo or equivalent) or multi-agent simulation and coordination methods.</li><li>Familiarity with operations research methods applicable to constrained sequential decisioning: linear programming, mixed-integer programming, Lagrangian relaxation, or constraint programming.</li><li>Experience operating decision or optimization systems in regulated domains (healthcare, finance, or insurance) where member safety, auditability, and explainability 
are requirements.</li><li>Experience building simulation environments using Gymnasium, SimPy, AnyLogic, or equivalent frameworks for policy evaluation and backtesting.</li><li>Familiarity with event-driven feedback loops and how disposition signals feed retraining or re-optimization pipelines.</li><li>OpenTelemetry instrumentation experience for ML or optimization pipeline observability.</li></ul><p></p><p><b>Additional Information</b></p><p>This role is not eligible for work visa sponsorship.</p><p></p><p><b>WAH Internet Statement</b></p><p>To ensure Home or Hybrid Home/Office employees' ability to work effectively, the self-provided internet service of Home or Hybrid Home/Office employees must meet the following criteria: At minimum, a download speed of 25 Mbps and an upload speed of 10 Mbps is required; wireless, wired cable or DSL connection is suggested. Satellite, cellular and microwave connection can be used only if approved by leadership. Employees who live and work from Home in the state of California, Illinois, Montana, or South Dakota will be provided a bi-weekly payment for their internet expense. Humana will provide Home or Hybrid Home/Office employees with telephone equipment appropriate to meet the business requirements for their position/job. Work from a dedicated space lacking ongoing interruptions to protect member PHI / HIPAA information.</p><p style="text-align:inherit"></p><p style="text-align:inherit"></p>Travel: While this is a remote position, occasional travel to Humana's offices for training or meetings may be required.<p style="text-align:inherit"></p><p style="text-align:left"><b>Scheduled Weekly Hours</b></p><p style="text-align:inherit"></p>40<p style="text-align:inherit"></p><p style="text-align:left"><b>Pay Range</b></p>The compensation range below reflects a good faith estimate of starting base pay for full time (40 hours per week) employment at the time of posting. 
The pay range may be higher or lower based on geographic location and individual pay will vary based on demonstrated job related skills, knowledge, experience, education, certifications, etc.<p style="text-align:inherit"><br> </p>$129,300 - $177,800 per year<p style="text-align:inherit"><br> </p>This job is eligible for a bonus incentive plan. This incentive opportunity is based upon company and/or individual performance.<p style="text-align:inherit"></p><p style="text-align:left"><b>Description of Benefits</b></p>Humana, Inc. and its affiliated subsidiaries (collectively, “Humana”) offers competitive benefits that support whole-person well-being. Associate benefits are designed to encourage personal wellness and smart healthcare decisions for you and your family while also knowing your life extends outside of work. Among our benefits, Humana provides medical, dental and vision benefits, 401(k) retirement savings plan, time off (including paid time off, company and personal holidays, volunteer time off, paid parental and caregiver leave), short-term and long-term disability, life insurance and many other opportunities.<p style="text-align:inherit"></p><p style="text-align:inherit"></p>Application Deadline: 07-25-2026<h1><br><b>About us</b><br> </h1>About Humana: Humana Inc. (NYSE: HUM) is a leading U.S. healthcare company. Through our Humana insurance services and our CenterWell healthcare services, we make it easier for the millions of people we serve to achieve their best health – delivering the care and service they need, when they need it. These efforts are leading to a better quality of life for people with Medicare and Medicaid, families, individuals, military service personnel, and communities at large. 
Learn more about what we offer at Humana.com and at CenterWell.com.<p style="text-align:inherit"></p><p style="text-align:inherit"></p><p>​<br><b>Equal Opportunity Employer</b></p><p></p><p><span>It is the policy of Humana not to discriminate against any employee or applicant for employment because of race, color, religion, sex, sexual orientation, gender identity, national origin, age, marital status, genetic information, disability or protected veteran status. It is also the policy of Humana to take affirmative action, in compliance with Section 503 of the Rehabilitation Act and VEVRAA, to employ and to advance in employment individuals with disability or protected veteran status, and to base all employment decisions only on valid job requirements. This policy shall apply to all employment actions, including but not limited to recruitment, hiring, upgrading, promotion, transfer, demotion, layoff, recall, termination, rates of pay or other forms of compensation and selection for training, including apprenticeship, at all levels of employment.</span></p>
