Theoretical Analysis of Positional Encodings in Transformer Models (arxiv.org)
36 points by PaulHoule 4 days ago | 4 comments
  • semiinfinitely 4 days ago

    Kinda disappointing that RoPE, the most common PE, gets about one sentence in this work and is omitted from the analysis.

    • gsf_emergency_2 4 days ago | parent

      Maybe it's because RoPE by itself does nothing for model capacity?

      • semiinfinitely 3 days ago | parent

        Neither does ALiBi

        • gsf_emergency_2 21 hours ago | parent

          ALiBi doesn't have unnecessary "periodic" constraints like RoPE (or the sinusoidal encodings mentioned in the paper, which are equivalent to RoPE)

          So it sounds like a better starting point if you are looking for unlimited context
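
A minimal sketch (not from the paper or the thread) of the contrast gsf_emergency_2 describes: ALiBi biases attention logits with a penalty that grows monotonically with query-key distance, while RoPE rotates feature pairs by sinusoidal angles that are periodic in position. The slope value and the symmetric form of the ALiBi bias below are illustrative simplifications.

    import numpy as np

    def alibi_bias(seq_len, slope=0.5):
        # ALiBi (simplified, symmetric form): penalize attention logits in
        # proportion to |i - j| -- monotone in distance, no periodic component.
        # The original formulation uses per-head slopes and a causal mask.
        pos = np.arange(seq_len)
        return -slope * np.abs(pos[None, :] - pos[:, None])

    def rope_rotate(x, positions, base=10000.0):
        # RoPE: rotate each 2-D feature pair by angle (position * frequency).
        # The angles are sinusoidal in position, i.e. inherently periodic.
        d = x.shape[-1]
        assert d % 2 == 0
        freqs = base ** (-np.arange(0, d, 2) / d)     # one frequency per pair
        ang = positions[:, None] * freqs[None, :]     # (seq_len, d/2)
        cos, sin = np.cos(ang), np.sin(ang)
        x1, x2 = x[..., 0::2], x[..., 1::2]
        out = np.empty_like(x)
        out[..., 0::2] = x1 * cos - x2 * sin
        out[..., 1::2] = x1 * sin + x2 * cos
        return out

    print(alibi_bias(5))                         # penalty strictly grows with |i - j|
    q = np.random.randn(8, 4)
    print(rope_rotate(q, np.arange(8)).shape)    # (8, 4), each row rotated by its position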