Efficient pretraining with token superposition