
In natural language processing models like BERT, what do the "attention mask" and "pad token id" primarily contribute to?
A) Sentence segmentation
B) Named entity recognition
C) Masked language modeling
D) Sequence classification
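For context, the attention mask and pad token id exist so that variable-length sequences can be batched together: shorter sequences are filled with the pad token id, and the attention mask marks which positions are real tokens (1) versus padding (0) so the model ignores the padding. A minimal dependency-free sketch (the token id values, including `pad_token_id = 0`, are illustrative assumptions):

```python
PAD_TOKEN_ID = 0  # assumed pad token id for illustration

def pad_batch(sequences, pad_token_id=PAD_TOKEN_ID):
    """Pad each sequence to the batch max length and build the
    attention mask: 1 for real tokens, 0 for padding."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(seq + [pad_token_id] * n_pad)
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return input_ids, attention_mask

# Two sequences of different lengths (token ids are made up):
batch = [[101, 7592, 102], [101, 7592, 2088, 999, 102]]
ids, mask = pad_batch(batch)
# ids  -> [[101, 7592, 102, 0, 0], [101, 7592, 2088, 999, 102]]
# mask -> [[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]]
```

Note that neither mechanism is specific to any one of the listed tasks; both are preprocessing details that make batched inference and training possible.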