2LSPE: 2D Learnable Sinusoidal Positional Encoding using Transformer for Scene Text Recognition