
In natural language processing models like BERT, what do the "attention mask" and "pad token id" primarily contribute to?
A) Sentence segmentation
B) Named entity recognition
C) Masked language modeling
D) Sequence classification
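For context: in BERT-style models, the pad token id fills shorter sequences so a batch has uniform length, and the attention mask marks which positions are real tokens (1) versus padding (0) so the model ignores the padding. The sketch below illustrates this in plain Python; the pad id `0` and the toy token ids are illustrative assumptions, not real BERT vocabulary entries.

```python
# Illustrative sketch of padding and attention-mask construction,
# as done by tokenizers for BERT-style models.
PAD_TOKEN_ID = 0  # assumed pad id for this sketch

def pad_batch(sequences):
    """Pad variable-length token-id sequences to a common length and
    build an attention mask: 1 for real tokens, 0 for padding."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(seq + [PAD_TOKEN_ID] * n_pad)
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return input_ids, attention_mask

# Two toy sequences of different lengths in one batch.
ids, mask = pad_batch([[101, 7592, 102], [101, 7592, 2088, 999, 102]])
```

Here the shorter sequence is extended with pad ids to length 5, and its mask zeros out the padded positions so attention is computed only over real tokens.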