This repository provides a comprehensive collection of Turkish sentiment analysis datasets published between 2012 and 2025, covering domains such as social media, e-commerce, news, and political commentary. It includes access links for publicly available datasets, contact information for restricted datasets, and detailed reuse references. The repository also provides a Python script for sentiment analysis using pre-trained transformer models.
To build this repository, we systematically reviewed academic studies indexed in Scopus and other scholarly databases. The search focused on publications that applied sentiment analysis using Turkish-language data or introduced sentiment-labeled Turkish datasets. Inclusion criteria required that papers either:
- Used classification models on labeled Turkish sentiment datasets and reported results, or
- Contributed novel Turkish datasets suitable for future modeling.
The search parameters were:
- Query: 'sentiment analysis' AND 'Turkish dataset'
- Databases: Scopus
- Document Types: Conference papers, journal articles, book chapters
- Date Range: 2012–2025
The final collection includes 78 studies and over 80 datasets. Among these:
- More than 30 datasets are publicly available and linked,
- Others are listed with author contacts for access,
- Reused datasets are referenced with their original sources.
The repository provides:
- Links to publicly available datasets
- Contact information for datasets that are not openly accessible
- Reuse citations for datasets previously published or used in multiple studies
- Clone this repository and enter its directory:
  git clone https://github.com/sevvalckc/Turkish-SAD.git
  cd Turkish-SAD
- Install required libraries: pip install -r requirements.txt
- Ensure your datasets (e.g., data1.csv, data2.csv) are placed in the same directory as the script.
- Run the script: python sentiment_analysis.py
- The script will output sentiment analysis results to CSV files for each model.
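The steps above can be pictured with a minimal sketch of a per-file classification pass. This is not the code in sentiment_analysis.py, whose internals are not described here. The function names, the `text` column, and the `predicted_label`/`predicted_score` output columns are assumptions made for illustration, and the standard-library `csv` module is used instead of pandas so the sketch has no heavy dependencies.

```python
import csv
from pathlib import Path
from typing import Callable, Dict, List

# A classifier maps a list of texts to one {"label", "score"} dict per text,
# the shape a Hugging Face text-classification pipeline returns.
Classifier = Callable[[List[str]], List[Dict]]


def load_classifier(model_id: str) -> Classifier:
    """Build a Hugging Face pipeline; imported lazily because loading is slow."""
    from transformers import pipeline

    return pipeline("sentiment-analysis", model=model_id)


def classify_csv(src: Path, dst: Path, classify: Classifier,
                 text_column: str = "text") -> int:
    """Label every row of `src` and write rows plus predictions to `dst`.

    `text_column` is a hypothetical column name; adjust it to your dataset.
    Returns the number of rows written.
    """
    with src.open(newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    if not rows:
        return 0
    predictions = classify([row[text_column] for row in rows])
    fieldnames = [*rows[0].keys(), "predicted_label", "predicted_score"]
    with dst.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        for row, pred in zip(rows, predictions):
            writer.writerow({**row,
                             "predicted_label": pred["label"],
                             "predicted_score": pred["score"]})
    return len(rows)


# Example call (downloads the model on first use; output name is hypothetical):
# classify_csv(Path("data1.csv"), Path("data1_xlmt.csv"),
#              load_classifier("cardiffnlp/twitter-xlm-roberta-base-sentiment"))
```

Passing the classifier in as a callable keeps the file handling separate from model loading, so the CSV logic can be tried with a stub before any model is downloaded.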
The script requires the following Python libraries and versions:
- Pandas version: 2.2.2
- PyTorch version: 2.5.1+cu121
- Transformers version: 4.46.2
- Scipy version: 1.13.1
To install all required libraries, run: pip install -r requirements.txt
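After installing, it can help to confirm that the environment matches the versions listed above. The following is a small sketch using the standard-library `importlib.metadata`. The distribution names (`pandas`, `torch`, `transformers`, `scipy`) are the usual PyPI names and are assumed here, and the helper names are illustrative. Local build labels such as `+cu121` are ignored in the comparison, which is a design choice of this sketch and not a requirement stated by the repository.

```python
from importlib import metadata
from typing import Dict, Optional, Tuple

# Versions listed above, keyed by (assumed) PyPI distribution name.
PINNED = {
    "pandas": "2.2.2",
    "torch": "2.5.1+cu121",
    "transformers": "4.46.2",
    "scipy": "1.13.1",
}


def public_version(version: str) -> str:
    """Drop a local build label such as '+cu121' before comparing."""
    return version.split("+", 1)[0]


def find_mismatches(pinned: Dict[str, str]) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {package: (expected, installed)} for missing or differing packages."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed is None or public_version(installed) != public_version(expected):
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

An empty result means every listed package is installed at the listed public version.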
The script uses the following pre-trained models (Hugging Face model IDs):
- TurkishBERTweet: VRLLab/TurkishBERTweet-Lora-SA
- TSAM: emre/turkish-sentiment-analysis
- BERTurk: akoksal/bounti
- XLM-T: cardiffnlp/twitter-xlm-roberta-base-sentiment
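Since results are written to one CSV file per model, the model list can be kept as a simple mapping and combined with the dataset files to plan the runs. The sketch below is illustrative only. The `output_name` scheme and both helper functions are hypothetical and may not match the file names the script actually produces. Note also that the name of the first model ID suggests a LoRA adapter, which may need to be loaded differently from the other three (for example with the `peft` library on top of a base model); that is an assumption, so check the model card.

```python
from pathlib import Path
from typing import Dict, Iterable, List, Tuple

# Display name -> model ID, as listed above.
MODELS = {
    "TurkishBERTweet": "VRLLab/TurkishBERTweet-Lora-SA",
    "TSAM": "emre/turkish-sentiment-analysis",
    "BERTurk": "akoksal/bounti",
    "XLM-T": "cardiffnlp/twitter-xlm-roberta-base-sentiment",
}


def output_name(dataset_file: str, model_name: str) -> str:
    """Hypothetical naming scheme: <dataset stem>_<model name in lower case>.csv."""
    return f"{Path(dataset_file).stem}_{model_name.lower()}.csv"


def planned_runs(dataset_files: Iterable[str],
                 models: Dict[str, str] = MODELS) -> List[Tuple[str, str, str]]:
    """List (dataset file, model ID, output file) for every dataset/model pair."""
    return [
        (dataset, model_id, output_name(dataset, name))
        for dataset in dataset_files
        for name, model_id in models.items()
    ]


if __name__ == "__main__":
    for dataset, model_id, out in planned_runs(["data1.csv", "data2.csv"]):
        print(f"{dataset} -> {model_id} -> {out}")
```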
Enabling TPU and High RAM
To use this script on Google Colab with TPU and high RAM, follow these steps:
- Open Google Colab: Go to Google Colab.
- Upload the script: Upload sentiment_analysis.py and your datasets (data1.csv, data2.csv) to Colab.
- Enable TPU: Go to Runtime > Change runtime type and select TPU from the Hardware accelerator dropdown menu.
- Enable High RAM: Go to Runtime > Manage sessions, click on the current session, and select High-RAM from the options available.
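Once the runtime is configured, a quick check inside the notebook can confirm what it actually offers. The sketch below reports total memory and whether the relevant libraries are importable. It relies on two facts: PyTorch reaches TPUs through the separate `torch_xla` package, and `os.sysconf` exposes memory figures on Linux hosts such as Colab. The function names are illustrative. Because the PyTorch build listed in this README (2.5.1+cu121) is a CUDA build, a GPU runtime may be a better fit than a TPU; treat that as an observation to verify, not a tested recommendation.

```python
import importlib.util
import os
from typing import Optional


def total_ram_gb() -> Optional[float]:
    """Total physical memory in GiB, or None where os.sysconf lacks these keys."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1024**3
    except (AttributeError, ValueError, OSError):
        return None


def describe_runtime() -> dict:
    """Summarise memory and which accelerator-related packages are importable."""
    return {
        "ram_gb": total_ram_gb(),
        "torch_installed": importlib.util.find_spec("torch") is not None,
        # torch_xla is the bridge PyTorch needs in order to run on a TPU.
        "torch_xla_installed": importlib.util.find_spec("torch_xla") is not None,
    }


if __name__ == "__main__":
    for key, value in describe_runtime().items():
        print(f"{key}: {value}")
```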