EMOJI CATEGORY AND POSITION PREDICTION IN TEXT PASSAGES (2022)
- Built a new dataset from approximately 350K scraped tweets, recording each emoji together with its character-level and word-level index in the text.
- Implemented a Bi-LSTM network with pre-trained GloVe embeddings to predict both the emoji type and its position within a text passage. Because many emojis are semantically similar, emojis were clustered using emoji2vec (visualized on the left), and the clusters, rather than individual emojis, served as target labels. The model achieved 62% accuracy on emoji prediction and 78% accuracy on position prediction; sample predictions are shown on the right.
- This project was completed as part of CS 7650 (Natural Language Processing). The complete implementation and detailed analysis are available in the repository and research report.
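
The character-level and word-level indexing described in the dataset step can be illustrated with a minimal sketch. This is not the project's actual pipeline: the function name `extract_emojis`, the `EmojiRecord` type, and the simplified Unicode ranges used to detect emojis are all assumptions made for illustration. In particular, the regex treats every code point in a few emoji blocks as a separate emoji and does not handle ZWJ sequences, skin-tone modifiers, or variation selectors.

```python
import re
from typing import List, NamedTuple, Tuple

# Simplified assumption: any code point in these blocks counts as one emoji.
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


class EmojiRecord(NamedTuple):
    emoji: str
    char_index: int  # offset in the emoji-free text where the emoji stood
    word_index: int  # number of whitespace-separated words before the emoji


def extract_emojis(tweet: str) -> Tuple[str, List[EmojiRecord]]:
    """Strip emojis from a tweet and record where each one was located."""
    records: List[EmojiRecord] = []
    clean_parts: List[str] = []
    last = 0
    for match in EMOJI_RE.finditer(tweet):
        # Keep the text between the previous emoji and this one.
        clean_parts.append(tweet[last:match.start()])
        prefix = "".join(clean_parts)
        records.append(
            EmojiRecord(match.group(), len(prefix), len(prefix.split()))
        )
        last = match.end()
    clean_parts.append(tweet[last:])
    return "".join(clean_parts), records


# Hypothetical example tweet (not taken from the project's dataset).
clean_text, records = extract_emojis(
    "good morning \u2600 have a great day \U0001F600"
)
for record in records:
    print(record)
```

Under these assumptions, each record pairs an emoji with a position target (character or word index) that a position-prediction model could be trained on, while the emoji-free text serves as the model input.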