Discussion about this post

Daniel Popescu / ⧉ Pluralisk

It's super interesting how you break down the RLHF process so clearly, highlighting the absolutely critical role of human feedback in tackling inherent biases, such as gender bias, in the training data. While Constitutional AI is a massive step forward, I always wonder how we truly ensure that the 'H' component, even when enhanced by AI, maintains the depth of ethical reasoning needed as models scale, or whether constant, vigilant human auditing will always be indispensable for catching those subtle emergent biases.

