This paper establishes a theoretical framework for personalized alignment in large language models, identifying the conditions under which a model can efficiently adapt to diverse user preferences. The author characterizes a decision-relevant user diversity condition: the user population must be varied enough to expose every latent reward direction that could change the optimal model response. When this condition holds, simple greedy algorithms achieve optimal rates, namely bounded online regret and logarithmic offline sample complexity. When it fails, any learner incurs higher regret and is statistically less efficient. Simulation experiments with Bradley-Terry preference models support these findings, showing that personalized rewards can be identified during an initial learning phase. The work thus identifies user diversity as the primary driver of personalized identifiability, which reconciles conflicting empirical reports on the efficacy of personalized versus non-personalized alignment methods.
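As a rough illustration of the intuition behind the diversity condition, the sketch below pairs the standard Bradley-Terry preference probability with a crude proxy for "exposed reward directions": the rank of the matrix of user feature vectors. The linear reward, the feature vectors, and the helper names (`reward`, `exposed_directions`) are assumptions made for this example only. They are not the paper's formal condition or its experimental setup.

```python
import math


def bt_prob(reward_a, reward_b):
    """Bradley-Terry probability that response a is preferred over response b."""
    return 1.0 / (1.0 + math.exp(-(reward_a - reward_b)))


def reward(user, response):
    """Hypothetical personalized reward: inner product of user and response features."""
    return sum(u * x for u, x in zip(user, response))


def exposed_directions(users, tol=1e-9):
    """Rank of the user-feature matrix, computed by Gaussian elimination.

    Used here only as a stand-in for how many latent reward directions
    a population of users can reveal through its preferences.
    """
    rows = [[float(v) for v in u] for u in users]
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(n_cols):
        # Choose the unused row with the largest entry in this column.
        pivot = max(range(rank, len(rows)),
                    key=lambda r: abs(rows[r][col]), default=None)
        if pivot is None or abs(rows[pivot][col]) <= tol:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # Eliminate this column from all rows below the pivot row.
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


# Assumed toy populations in a 2-dimensional feature space.
diverse_users = [[1, 0], [0, 1], [1, 1]]
homogeneous_users = [[1, 0], [2, 0], [0.5, 0]]

# Two responses that differ only along the second feature.
response_a, response_b = [0, 1], [0, 0]
```

In this toy setting, every user in `homogeneous_users` assigns the two responses the same reward, so the Bradley-Terry probability is exactly 0.5 and their comparisons reveal nothing about the second direction. A user such as `[0, 1]` from the diverse population prefers `response_a` with probability above 0.5, which is what makes that direction learnable.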