Waymo Autonomous Driving Planning Interview: Behavior Prediction, Motion Planning, and Decision Making
A detailed review of Waymo's three technical interview rounds from a candidate with 2 years of autonomous driving planning experience: Round 1 on behavior prediction and deep learning, Round 2 on motion planning and Lattice Planner, and Round 3 on the decision system and a project deep dive, with a question summary and tips.
Background
Let me start with my background. I worked as a planning algorithm engineer at another autonomous driving company for 2 years, mainly focusing on behavior prediction and motion planning. Honestly, Waymo has always been my dream company — their technical capabilities and deployment progress are top-tier in the autonomous driving industry. When I saw they were hiring planning algorithm engineers earlier this year, I applied without hesitation.
Before applying, I was actually pretty nervous. Although I had 2 years of planning experience, it was mainly on structured roads, and I didn't have much experience with complex scenarios. But then I thought, how would I know if I didn't try? So I spent two weeks reviewing behavior prediction, motion planning, and decision-making concepts, especially Lattice Planner and learning-based prediction methods — these are the key focus areas.
The entire interview process consisted of three technical rounds plus one HR round. Let me walk through each round in detail.
Interview Process Review
Round 1: Behavior Prediction + Deep Learning
My first interviewer was a young engineer from the prediction team who was very friendly. He had me start with a self-introduction and then jumped right into technical questions.
He first asked how behavior prediction works in my current project. I explained our intent-based prediction framework: first classify intent (left turn, go straight, right turn), then generate trajectories based on the intent. He followed up with several questions:
1. What's the accuracy of intent classification? How serious are the consequences of misclassification?
I said the intent classification accuracy is around 85%. Misclassification can indeed lead to unsafe planned trajectories, so we have a safety check at the planning layer — if there's a conflict between predicted and planned trajectories, it triggers emergency obstacle avoidance. He nodded, seemingly satisfied with this answer.
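To make the safety check concrete, here is a minimal sketch of a predicted-versus-planned conflict test. It is an illustration under simplifying assumptions, not the production logic described above: both trajectories are assumed to be sampled at the same timestamps as (x, y) points, and the function name `first_conflict_index` and the 2 m safety radius are hypothetical.

```python
import math

def first_conflict_index(planned, predicted, safety_radius=2.0):
    """Return the first time index where the two trajectories come closer
    than safety_radius (meters), or None if they never do.

    Both inputs are lists of (x, y) points sampled at the same timestamps.
    """
    for i, ((px, py), (qx, qy)) in enumerate(zip(planned, predicted)):
        if math.hypot(px - qx, py - qy) < safety_radius:
            return i
    return None

# Ego drives along the x-axis; another vehicle is predicted to cross at x = 3.
ego = [(float(t), 0.0) for t in range(6)]
crossing = [(3.0, 5.0 - 2.0 * t) for t in range(6)]
conflict = first_conflict_index(ego, crossing)
```

A planner could treat a non-`None` result as the trigger for the avoidance behavior, and the index tells it how many time steps remain before the conflict.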
2. Have you considered using end-to-end methods for prediction?
I said we tried using Social GAN for trajectory prediction. The results were decent, but the interpretability was too poor — we didn't dare use it in safety-critical scenarios. Later, we switched to a hybrid rule-based + learning approach, where rules handle high-certainty scenarios and learning handles high-uncertainty scenarios.
Then he moved on to deep learning fundamentals:
3. What are the advantages of Transformer in sequence prediction?
I said the main advantage is that self-attention captures long-range dependencies directly, and that the whole sequence can be processed in parallel rather than step by step, which makes training much faster than with an LSTM. In behavior prediction, we use a Transformer to encode the historical trajectories of surrounding vehicles, and the results were significantly better than with an LSTM.
4. How did you design the loss function for training the prediction model?
I said we used a multi-task loss, including cross-entropy loss for intent classification and ADE/FDE (average and final displacement error) loss for trajectory regression, plus a diversity loss to encourage the model to generate diverse predicted trajectories. He asked how the diversity loss is specifically implemented. I said it's based on a GMM, where each mode corresponds to a Gaussian distribution, using negative log-likelihood as the loss.
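A rough sketch of how such a multi-task loss could be assembled is shown here in plain Python for a single sample. Several things are assumptions made for illustration and are not from the interview: the mixture uses isotropic 2D Gaussians, the negative log-likelihood is evaluated only at the trajectory endpoint, and the weights `w_cls`, `w_reg`, and `w_nll` are hypothetical. A real model would compute this with a deep learning framework over batches.

```python
import math

def cross_entropy(probs, target_idx):
    """Cross-entropy for one sample, given predicted class probabilities."""
    return -math.log(probs[target_idx])

def ade_fde(pred, gt):
    """Average and final displacement error between two (x, y) trajectories."""
    dists = [math.hypot(px - gx, py - gy) for (px, py), (gx, gy) in zip(pred, gt)]
    return sum(dists) / len(dists), dists[-1]

def gmm_nll(point, weights, means, sigmas):
    """Negative log-likelihood of a 2D point under a mixture of isotropic
    Gaussians (one component per predicted mode), using log-sum-exp."""
    x, y = point
    log_terms = []
    for w, (mx, my), s in zip(weights, means, sigmas):
        sq_dist = (x - mx) ** 2 + (y - my) ** 2
        log_terms.append(math.log(w) - math.log(2 * math.pi * s * s) - sq_dist / (2 * s * s))
    m = max(log_terms)
    return -(m + math.log(sum(math.exp(t - m) for t in log_terms)))

def multi_task_loss(intent_probs, intent_gt, pred_traj, gt_traj,
                    weights, means, sigmas, w_cls=1.0, w_reg=1.0, w_nll=1.0):
    """Weighted sum of classification, regression, and mixture NLL terms."""
    ade, fde = ade_fde(pred_traj, gt_traj)
    nll = gmm_nll(gt_traj[-1], weights, means, sigmas)  # assumption: endpoint only
    return w_cls * cross_entropy(intent_probs, intent_gt) + w_reg * (ade + fde) + w_nll * nll

loss = multi_task_loss([0.1, 0.8, 0.1], 1,
                       [(0.0, 0.0), (1.0, 0.0)], [(0.0, 0.0), (1.0, 2.0)],
                       [0.5, 0.5], [(1.0, 2.0), (1.0, -2.0)], [1.0, 1.0])
```

Because the likelihood is summed over modes, the loss stays low as long as at least one mode lands near the ground truth, which is what lets the other modes cover alternative outcomes.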
5. How do you handle cases where prediction results deviate significantly from actual behavior?
I said we have an online monitoring system. If prediction deviation exceeds a threshold, it triggers re-prediction, and the planning layer switches to a more conservative strategy. We also collect these cases and add them to the training data.
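The monitoring loop can be sketched roughly as follows. This is a simplified illustration, not the actual system: the class name `PredictionMonitor`, the 1.5 m threshold, and the mode strings are all hypothetical, and the deviation is measured as a single point-to-point distance.

```python
import math

class PredictionMonitor:
    """Compares a predicted position with the observed one and flags
    large deviations so the planner can fall back to a cautious mode."""

    def __init__(self, threshold_m=1.5):
        self.threshold_m = threshold_m
        self.hard_cases = []  # case ids to add to the training set later

    def check(self, predicted_xy, observed_xy, case_id):
        error = math.hypot(predicted_xy[0] - observed_xy[0],
                           predicted_xy[1] - observed_xy[1])
        if error > self.threshold_m:
            self.hard_cases.append(case_id)
            return "conservative"  # caller re-predicts and plans cautiously
        return "nominal"

monitor = PredictionMonitor()
mode_small = monitor.check((0.0, 0.0), (0.5, 0.0), "case-a")
mode_large = monitor.check((0.0, 0.0), (3.0, 4.0), "case-b")
```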
Round 1 lasted about an hour. The interviewer said my "fundamentals are solid" and told me to wait for the Round 2 notification.
Round 2: Motion Planning + Lattice Planner
Round 2 was with a senior engineer. He went straight into motion planning, and the atmosphere was noticeably more tense than Round 1.
1. What is your understanding of Lattice Planner?
I said the core idea of Lattice Planner is to sample a set of candidate trajectories in the state space, then evaluate each trajectory using a cost function and select the one with the lowest cost. The key points are the sampling strategy and cost function design. Sampling too densely leads to high computational cost, while sampling too sparsely might miss the optimal solution. The cost function needs to balance comfort, safety, and traffic rule compliance.
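The sample-then-score loop can be illustrated with a small lateral-only sketch. The assumptions here are mine, not from the interview: candidates are rest-to-rest quintic polynomials in lateral offset, the cost combines squared jerk, deviation from a target offset, and duration, and "collision" is crudely modeled as blocked intervals of end offset. All names and weights are hypothetical.

```python
import math

def quintic_coeffs(d0, d1, T):
    """Quintic d(t) moving from offset d0 to d1 in time T, with zero
    lateral velocity and acceleration at both ends."""
    delta = d1 - d0
    return (d0, 0.0, 0.0, 10 * delta / T**3, -15 * delta / T**4, 6 * delta / T**5)

def evaluate(c, t):
    return sum(ck * t**k for k, ck in enumerate(c))

def jerk(c, t):
    # Third derivative of the polynomial.
    return 6 * c[3] + 24 * c[4] * t + 60 * c[5] * t * t

def trajectory_cost(c, T, target_d, blocked, n=20, w_jerk=0.1, w_offset=1.0, w_time=0.1):
    end_d = evaluate(c, T)
    if any(lo <= end_d <= hi for lo, hi in blocked):
        return math.inf  # stand-in for a real collision check
    dt = T / n
    jerk_cost = sum(jerk(c, i * dt) ** 2 for i in range(n + 1)) * dt
    return w_jerk * jerk_cost + w_offset * (end_d - target_d) ** 2 + w_time * T

def plan(d0, target_d, blocked=(), end_offsets=(-3.5, 0.0, 3.5), durations=(2.0, 3.0, 4.0, 5.0)):
    """Enumerate the lattice of (end offset, duration) and keep the cheapest."""
    best = None
    for d1 in end_offsets:
        for T in durations:
            cost = trajectory_cost(quintic_coeffs(d0, d1, T), T, target_d, blocked)
            if best is None or cost < best[0]:
                best = (cost, d1, T)
    return best

free_choice = plan(0.0, 3.5)
blocked_choice = plan(0.0, 3.5, blocked=[(3.0, 4.0)])
```

With the target lane free, the cheapest candidate is the slow, smooth move to the target offset; when that offset is blocked, the planner stays in its lane. Making `end_offsets` and `durations` denser improves the best cost found but multiplies the number of candidates to score.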
2. What's the difference between Lattice Planner and RRT*?
I said Lattice Planner searches in a discretized state space, while RRT* randomly samples in continuous space. The advantage of Lattice Planner is controllable trajectory quality, but the disadvantage is that state space discretization might miss some good solutions. RRT* is probabilistically complete, but trajectories may not be smooth enough and require post-processing.
3. How do you handle dynamic obstacles in real projects?
I said we use a spatiotemporal A* algorithm, adding time as a third dimension to the search space. For dynamic obstacles, we mark occupied regions in the spatiotemporal graph based on predicted trajectories and avoid these regions during search. He asked if the computational cost would be too high — I said it indeed would be, so we did some optimizations, like only doing spatiotemporal search within a certain range around obstacles and using static planning for distant areas.
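A minimal grid version of the idea is sketched below. It assumes unit-time moves on a 4-connected grid with a wait action, and represents predicted obstacle occupancy as a set of (x, y, t) cells. It does not include the local-range optimization mentioned above and ignores swap conflicts. The function name and parameters are hypothetical.

```python
import heapq

def spacetime_astar(start, goal, width, height, occupied, max_t=50):
    """A* over (x, y, t). `occupied` holds (x, y, t) cells blocked by
    predicted obstacle trajectories. Returns the list of (x, y) cells
    visited at t = 0, 1, 2, ... or None if no path is found."""
    def h(x, y):
        # Manhattan distance is admissible because every move costs one step.
        return abs(x - goal[0]) + abs(y - goal[1])

    open_heap = [(h(*start), 0, start[0], start[1], [start])]
    seen = set()
    while open_heap:
        _, t, x, y, path = heapq.heappop(open_heap)
        if (x, y) == goal:
            return path
        if (x, y, t) in seen or t >= max_t:
            continue
        seen.add((x, y, t))
        # Waiting in place is a valid action in space-time.
        for dx, dy in ((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny, nt = x + dx, y + dy, t + 1
            if 0 <= nx < width and 0 <= ny < height and (nx, ny, nt) not in occupied:
                heapq.heappush(open_heap, (nt + h(nx, ny), nt, nx, ny, path + [(nx, ny)]))
    return None

# A 5-cell corridor where a predicted obstacle occupies cell (2, 0) at t = 2.
occupied = {(2, 0, 2)}
free_path = spacetime_astar((0, 0), (4, 0), 5, 1, set())
delayed_path = spacetime_astar((0, 0), (4, 0), 5, 1, occupied)
```

The obstacle forces exactly one wait step, so the path gains one extra entry. The added time dimension is also why the search space grows quickly, which motivates restricting the space-time search to the neighborhood of obstacles.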
4. What is your understanding of the Frenet coordinate system, and why is it commonly used in autonomous driving planning?
I said the Frenet coordinate system is a curvilinear coordinate system established along a reference line, with the longitudinal direction represented by s (arc length along the reference line) and the lateral direction represented by d (signed distance from the reference line). It's commonly used in autonomous driving because roads are inherently curved, and using Cartesian coordinates to describe a vehicle's position on the road isn't very intuitive. The Frenet coordinate system naturally suits describing longitudinal and lateral vehicle movements on a lane, and it can decouple the 2D planning problem into two 1D problems, one longitudinal and one lateral.
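As a small illustration of the s and d coordinates, the following sketch projects a Cartesian point onto a polyline reference line. It assumes the reference line is given as distinct (x, y) vertices and takes d as positive to the left of the travel direction. The function name is hypothetical, and a production system would typically use a smooth reference curve instead of a polyline.

```python
import math

def cartesian_to_frenet(point, ref_line):
    """Project (x, y) onto a polyline reference line and return (s, d).
    s is arc length along the line; d > 0 means left of travel direction."""
    px, py = point
    best = None
    s_start = 0.0
    for (x0, y0), (x1, y1) in zip(ref_line, ref_line[1:]):
        dx, dy = x1 - x0, y1 - y0
        seg_len = math.hypot(dx, dy)
        # Clamp the projection parameter so the foot point stays on the segment.
        u = max(0.0, min(1.0, ((px - x0) * dx + (py - y0) * dy) / seg_len**2))
        cx, cy = x0 + u * dx, y0 + u * dy
        dist = math.hypot(px - cx, py - cy)
        if best is None or dist < best[0]:
            cross = dx * (py - y0) - dy * (px - x0)  # sign gives left/right
            best = (dist, s_start + u * seg_len, math.copysign(dist, cross))
        s_start += seg_len
    return best[1], best[2]

# Reference line: 10 m east, then 10 m north.
ref = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
s1, d1 = cartesian_to_frenet((5.0, 2.0), ref)    # left of the first segment
s2, d2 = cartesian_to_frenet((12.0, 5.0), ref)   # right of the second segment
```

After the conversion, "stay in lane" is simply a bound on d and "keep a gap to the lead vehicle" is a bound on s, regardless of how the road bends.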
5. How do you coordinate lateral and longitudinal planning?
I said we use an alternating optimization strategy: first fix the longitudinal trajectory and optimize lateral, then fix lateral and optimize longitudinal, iterating until convergence. Also, at high speeds, the coupling between lateral and longitudinal is relatively weak, so they can be planned independently. But at low speeds, the coupling is stronger, requiring joint optimization.
6. How do you ensure that planned trajectories are dynamically feasible?
I said we impose constraints at the sampling stage, only sampling trajectories that satisfy vehicle kinematic constraints (maximum curvature, maximum acceleration, etc.), rather than sampling first and checking later. This way, although the sampling space is smaller, we can guarantee that every candidate trajectory is executable.
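One way to picture constraining the sampler itself is to restrict the range of durations it may draw from. The sketch below assumes rest-to-rest quintic lateral maneuvers, for which the peak lateral acceleration has the closed form (10/√3)·|Δd|/T², and derives the shortest admissible duration from an acceleration limit. The 2.0 m/s² limit and the function names are hypothetical, and curvature limits would need a similar bound that is not shown here.

```python
import math

# Peak |d''(t)| of a rest-to-rest quintic is this factor times |delta| / T^2.
PEAK_ACCEL_FACTOR = 10.0 / math.sqrt(3.0)

def min_duration(delta, max_lat_accel):
    """Shortest duration for a lateral move of `delta` meters that keeps
    peak lateral acceleration within `max_lat_accel`."""
    return math.sqrt(PEAK_ACCEL_FACTOR * abs(delta) / max_lat_accel)

def sample_durations(delta, candidates, max_lat_accel=2.0):
    """Keep only durations that are feasible by construction."""
    t_min = min_duration(delta, max_lat_accel)
    return [T for T in candidates if T >= t_min]

def lateral_accel(delta, T, t):
    """Second derivative of the rest-to-rest quintic, for cross-checking."""
    tau = t / T
    return delta * (60 * tau - 180 * tau**2 + 120 * tau**3) / T**2

feasible = sample_durations(3.5, [2.0, 3.0, 4.0, 5.0])
```

For a 3.5 m lane change the 2 s and 3 s candidates are never generated, so no evaluation time is spent on trajectories the vehicle could not execute.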
Round 2 lasted about 1 hour and 15 minutes. The interviewer asked very detailed questions, and I didn't answer several of them well, especially the optimization details of spatiotemporal A* — my answers were quite vague. I was a bit worried after this round.
Round 3: Decision System + Project Deep Dive
Round 3 was with the tech lead of the planning team, who had a very imposing presence. He first asked me to describe the most challenging project I had worked on.
I talked about our complex intersection planning project, including how we handle multi-vehicle interactions and pedestrians suddenly entering the road. He asked several in-depth questions:
1. How is your decision system architected?
I said we use a hierarchical decision architecture: the top level is navigation decision (which route to take), the middle level is behavioral decision (car-following, lane change, yielding, etc.), and the bottom level is motion planning. The behavioral decision layer uses a finite state machine + rules, with states including car-following, lane change preparation, lane change execution, emergency stop, etc.
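The behavioral layer's state machine can be sketched as a transition table. The states are the ones named above; the event names and the rule that a collision risk preempts every state are assumptions made for illustration.

```python
from enum import Enum, auto

class State(Enum):
    CAR_FOLLOWING = auto()
    LANE_CHANGE_PREP = auto()
    LANE_CHANGE_EXEC = auto()
    EMERGENCY_STOP = auto()

# (current state, event) -> next state. Event names are hypothetical.
TRANSITIONS = {
    (State.CAR_FOLLOWING, "slow_lead_vehicle"): State.LANE_CHANGE_PREP,
    (State.LANE_CHANGE_PREP, "gap_available"): State.LANE_CHANGE_EXEC,
    (State.LANE_CHANGE_PREP, "gap_lost"): State.CAR_FOLLOWING,
    (State.LANE_CHANGE_EXEC, "lane_change_done"): State.CAR_FOLLOWING,
}

def step(state, event):
    """Advance the FSM by one event; unknown events leave the state unchanged."""
    if event == "collision_risk":
        return State.EMERGENCY_STOP  # safety rule overrides every state
    return TRANSITIONS.get((state, event), state)

state = State.CAR_FOLLOWING
for event in ("slow_lead_vehicle", "gap_available", "lane_change_done"):
    state = step(state, event)
```

Every new behavior adds rows to this table for each state it can interact with, which is where the growth in states and transitions comes from as scenarios get more complex.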
2. How do you solve the state explosion problem of finite state machines?
I said we do encounter state explosion problems, especially in complex scenarios. Our approach is to layer the states — high-level states handle abstract behaviors, and low-level states handle specific actions. We're also exploring using reinforcement learning to replace some rules, but for safety-critical scenarios, we still primarily use rules.
3. What happens if the decision and planning results conflict?
I said we have an arbitration mechanism: if the planning layer finds that the behavior given by the decision layer is infeasible (e.g., lane change but there's a vehicle alongside), it feeds back to the decision layer for re-decision. If there's not enough time, the planning layer executes a fallback strategy (e.g., maintain current lane and follow the car ahead).
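The feedback-then-fallback flow might look like the sketch below. It models "not enough time" as a fixed budget of re-decision attempts, which is an assumption made to keep the example deterministic. The function name, the callbacks, and the behavior strings are hypothetical.

```python
def arbitrate(behavior, is_feasible, redecide, max_redecisions=2,
              fallback="keep_lane_and_follow"):
    """Ask the decision layer for a new behavior while the planner rejects
    the current one; use the fallback once the budget is exhausted."""
    for attempt in range(max_redecisions + 1):
        if is_feasible(behavior):
            return behavior
        if attempt < max_redecisions:
            behavior = redecide(behavior)
    return fallback

# Planner rejects a left lane change because a vehicle is alongside.
chosen = arbitrate(
    "lane_change_left",
    is_feasible=lambda b: b != "lane_change_left",
    redecide=lambda b: "lane_change_right",
)
stuck = arbitrate(
    "lane_change_left",
    is_feasible=lambda b: False,
    redecide=lambda b: "lane_change_right",
)
```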
4. How do you evaluate the overall performance of the planning system?
I said we have an evaluation system including simulation evaluation and real-vehicle evaluation. Simulation evaluation mainly looks at pass rate, comfort metrics (acceleration, jerk), and safety metrics (collision rate, TTC). Real-vehicle evaluation mainly looks at takeover rate and manual intervention count. He asked about the consistency between simulation and real vehicle — I said about 70%, mainly because pedestrian behavior in simulation isn't realistic enough.
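A few of these metrics are simple enough to write down directly. The sketch below uses common textbook definitions, which are assumed here and may differ from the exact definitions used in the project: TTC for a lead vehicle in the same lane, peak jerk from sampled accelerations, and takeovers normalized per 1000 km.

```python
import math

def time_to_collision(gap_m, ego_speed, lead_speed):
    """Seconds until the gap closes at current speeds; infinite if not closing."""
    closing_speed = ego_speed - lead_speed
    return gap_m / closing_speed if closing_speed > 0 else math.inf

def max_abs_jerk(accels, dt):
    """Peak jerk magnitude from accelerations sampled every dt seconds."""
    return max(abs(a1 - a0) / dt for a0, a1 in zip(accels, accels[1:]))

def takeovers_per_1000_km(takeovers, distance_km):
    return takeovers / distance_km * 1000.0

ttc = time_to_collision(20.0, 15.0, 10.0)
peak_jerk = max_abs_jerk([0.0, 1.0, 1.0, -1.0], 0.5)
rate = takeovers_per_1000_km(3, 1500.0)
```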
5. What do you think is the biggest bottleneck of current planning systems?
I said I think the biggest bottleneck is the coupling between prediction and planning. Currently, prediction is done first, then planning — it's sequential. But in reality, prediction and planning influence each other: my planning behavior affects other vehicles' behavior, and changes in other vehicles' behavior in turn affect my planning. Without solving this problem, planning systems will never reach the level of human drivers.
Round 3 lasted over an hour. The interviewer clearly had high requirements for technical depth. At the end, he asked if I had any questions. I asked about Waymo's latest progress in behavior prediction, and he briefly mentioned they're also exploring joint optimization of prediction and planning.
Key Questions Summary
Behavior Prediction:
1. How to design an intent-based prediction framework? How to improve intent classification accuracy?
2. Pros and cons of end-to-end prediction methods? How to ensure interpretability in safety scenarios?
3. Advantages and specific applications of Transformer in sequence prediction?
4. Loss function design for prediction models (multi-task loss, diversity loss)?
5. How to handle large prediction deviations online?
Motion Planning:
6. Core concepts and key parameters of Lattice Planner?
7. Differences and applicable scenarios of Lattice Planner vs RRT*?
8. Handling dynamic obstacles in spatiotemporal graphs?
9. Understanding and advantages of the Frenet coordinate system?
10. Coordination strategies for lateral and longitudinal planning?
11. How to ensure dynamic feasibility of planned trajectories?
Decision System:
12. Design of hierarchical decision architecture?
13. State explosion problem in finite state machines?
14. Conflict resolution mechanism between decision and planning?
15. Evaluation system for planning systems?
16. Coupling problem between prediction and planning?
Tips and Advice
1. Build a solid foundation: Waymo's interviews place great emphasis on fundamentals. Whether it's deep learning or planning algorithms, they start from basic concepts and gradually go deeper. If your foundation isn't solid, you'll easily expose gaps during follow-up questions.
2. Be able to explain your project experience clearly: Interviewers will ask many details about your project, including why you did it this way, whether you considered other approaches, what problems you encountered and how you solved them. You must be very familiar with your own projects — you can't just give a general overview.
3. Stay updated on industry developments: The interview will touch on frontier directions like joint optimization of prediction and planning, end-to-end planning, etc. If you can share your understanding and thoughts, it'll earn you extra points.
4. Don't be afraid to say you don't know: There were several questions I wasn't sure about, and I directly said "I'm not very familiar with this." The interviewer didn't make things difficult — instead, they guided me from a different angle. Being caught pretending to know is the most embarrassing situation.
5. Plan for 2-3 weeks of preparation: If you have about 2 years of relevant experience, 2-3 weeks of focused preparation should be sufficient. Focus on reviewing planning algorithms (Lattice Planner, A*, RRT), behavior prediction (intent classification, trajectory generation), and deep learning fundamentals (Transformer, loss design).
FAQ
Q: How difficult is Waymo's interview?
A: Honestly, it's quite difficult, especially Rounds 2 and 3. Round 1 focuses on fundamentals, Round 2 on algorithm depth, and Round 3 on system design and project experience. Overall, I'd rate the difficulty as top-tier among autonomous driving companies.
Q: What's the interviewers' style like?
A: The three interviewers had quite different styles. Round 1 was relaxed, like a conversation; Round 2 was serious with dense questions; Round 3 had the most pressure — they kept asking follow-up questions until you couldn't answer.
Q: Do I need to practice LeetCode?
A: There were no LeetCode problems, but they asked about algorithm concepts, like how to design the heuristic function for A* and how to prune the search space. I'd recommend understanding the classic planning algorithms thoroughly.
Q: How's the compensation?
A: This varies by person. From what I know, the base salary for planning algorithm positions is roughly in the $150K-$220K range, depending on level and negotiation.
Q: How long does it take to get interview results?
A: I received the Round 2 notification 3 days after Round 1, Round 3 notification 5 days after Round 2, and the offer 1 week after Round 3.