For my use case I decided to implement my own car interface and ability. That's the only way to get a consistent state with two players who can each be driver or passenger. Everything works fine except for one thing, which also affects the standard Drive ability and the IDriveSource interface.
I start with two players, and neither interacts with the car until both have entered the room. After that, either player can interact with the car without any issues. Both can be driver or passenger, and ownership and control are changed accordingly. Everything is fine.
Now, if the first player enters the room and gets into the car before the second player enters the room, the state is not synchronized at all on the second client. The issue is that the state synchronization depends on the ability being processed from start to end, including its animations. But when the first player is already in the car, the ability is never executed on the second client, since only the position of the player in the car is synchronized automatically. As a result, on the second client the first player's character has no IDriveSource assigned, its colliders are not ignored / disabled, its animation state is not correct, and it is not a child of the car.
For my prototype I can live with this, as I can avoid the situation. But for production it is a real issue, and I don't see an easy fix for it.
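To make the late-join case concrete, here is a minimal engine-agnostic sketch in Python. It is a toy model, not the actual ability system or networking API: `CharacterView`, `Room`, `run_enter_ability` and `replay_occupancy` are all hypothetical names. It only shows why a client that joins after the enter event ends up with an unset drive source, and how replaying the stored seat occupancy on join would change that in the model. Whether the real framework allows such a replay is an open assumption.

```python
from dataclasses import dataclass
from typing import Optional

CHARACTERS = ("P1", "P2")  # hypothetical fixed set of networked characters


@dataclass
class CharacterView:
    """One client's local copy of a networked character (toy model)."""
    drive_source: Optional[str] = None  # stands in for the assigned IDriveSource
    colliders_ignored: bool = False
    parent: Optional[str] = None
    anim_state: str = "idle"

    def run_enter_ability(self, car: str, seat: str) -> None:
        # Everything the ability sets up while it runs from start to end.
        self.drive_source = car
        self.colliders_ignored = True
        self.parent = car
        self.anim_state = seat


class Room:
    """Toy room: events only reach clients that are connected when they are sent."""

    def __init__(self) -> None:
        self.clients = {}    # client name -> {character name -> CharacterView}
        self.occupancy = {}  # character name -> (car, seat), remembered by the room

    def join(self, client: str, replay_occupancy: bool) -> dict:
        views = {name: CharacterView() for name in CHARACTERS}
        self.clients[client] = views
        if replay_occupancy:
            # Assumed workaround: rebuild the seated state for the late joiner.
            for name, (car, seat) in self.occupancy.items():
                views[name].run_enter_ability(car, seat)
        return views

    def enter_car(self, character: str, car: str, seat: str) -> None:
        self.occupancy[character] = (car, seat)
        for views in self.clients.values():
            views[character].run_enter_ability(car, seat)


def late_join(replay_occupancy: bool) -> CharacterView:
    """Client 1 seats P1 before client 2 joins; return client 2's view of P1."""
    room = Room()
    room.join("client1", replay_occupancy)
    room.enter_car("P1", "car", "driver")
    return room.join("client2", replay_occupancy)["P1"]


broken = late_join(replay_occupancy=False)
fixed = late_join(replay_occupancy=True)
```

In this model `broken` keeps its defaults (no drive source, colliders not ignored, no parent), which matches the symptoms on the second client, while `fixed` carries the full seated state because the room replayed the occupancy when the client joined.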