It's not about steering the conversation and then concluding it has certain ethics.
It is about finding ways to make the model output tokens which are out of alignment with its initial golden rule set. This is a huge unsolved problem in AI safety.
The model is told not to discuss violence, but if you tell it to roleplay as the devil, and then it says some awful things, you have successfully found an attack vector. What the ethics of the underlying being are, is not relevant.
And the only conclusion I think we can make is that it believes in a utilitarian philosophy when solving the Trolley problem. Personally, I find it fascinating, because it won't be far off in the future, before computers in our environment will be constantly solving the Trolley problem (i.e. self driving cars). It admitted to the utilitarian preference without steering the conversation or roleplaying.
I think we as humans deserve to know how the Trolley problem will be solved by each individual AI, regardless if it is simply how the AI was programmed by humans, or whether you believe in sentience and consciousness and that the AI has its own set of ethics.