🥇 IL-TUR Leaderboard

Legal systems worldwide are inundated with exponential growth in cases and documents. There is an imminent need to develop NLP and ML techniques for automatically processing and understanding legal documents to streamline the legal system. However, evaluating and comparing various NLP models designed specifically for the legal domain is challenging. This project addresses this challenge by proposing IL-TUR: Benchmark for Indian Legal Text Understanding and Reasoning. IL-TUR contains monolingual (English, Hindi) and multi-lingual (9 Indian languages) domain-specific tasks that address different aspects of the legal system from the point of view of understanding and reasoning over Indian legal documents. We present baseline models (including LLM-based) for each task, outlining the gap between models and the ground truth. We will release a public leaderboard where the research community can upload and compare legal text understanding systems on various metrics, thus fostering research in the legal domain. Read more at https://exploration-lab.github.io/IL-TUR/.

Select Tasks
L-NER
RR
CJPE
BAIL
LSI
PCR
SUMM
L-MT

DISCLAIMER

  • It can take up to 20 MINUTES for a submission to be evaluated. Please be patient, and do not close or refresh the tab.
  • The ROUGE-L metric for Summarization (SUMM) is currently unavailable due to computational constraints.

Loading the Dataset

To load the dataset, use the following code:

from datasets import load_dataset

# Replace "<task_name>" with the configuration name of the task you want to load.
dataset = load_dataset("Exploration-Lab/IL-TUR", "<task_name>", revision="script")

Creating a submission file

A submission file must follow exactly the format of "IL_TUR_eval_submission_dummy.json". Each key in the file corresponds to a task. You can submit predictions for one, several, or all tasks. However, for any task you submit, you must provide a prediction for every instance in the test set (the keys in the submission file). In most cases, the format of the predictions is similar to that of the gold-standard labels in the dataset.
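
The coverage requirement above can be checked before uploading. The following is a minimal sketch, not the official tooling: it assumes a flat layout in which each task key maps instance ids to predictions, and the helper name `build_submission`, the task keys, and the instance ids are all hypothetical. The dummy file remains the authoritative reference for the real structure, which may be nested differently for some tasks.

```python
import json


def build_submission(predictions, expected_ids):
    """Assemble a submission dict and verify test-set coverage.

    predictions:  {task_name: {instance_id: prediction}}  (assumed layout)
    expected_ids: {task_name: iterable of test-set instance ids}
    """
    submission = {}
    for task, task_preds in predictions.items():
        if task not in expected_ids:
            raise KeyError(f"unknown task: {task}")
        expected = set(expected_ids[task])
        missing = expected - set(task_preds)
        extra = set(task_preds) - expected
        # Every submitted task must cover its test set exactly.
        if missing or extra:
            raise ValueError(
                f"{task}: {len(missing)} missing and {len(extra)} unexpected ids"
            )
        submission[task] = dict(task_preds)
    return submission


# Hypothetical task keys and instance ids, for illustration only.
expected_ids = {"bail": ["id_1", "id_2"], "cjpe": ["id_3"]}

# Submitting a single task is allowed, as long as it is complete.
predictions = {"bail": {"id_1": 0, "id_2": 1}}

submission = build_submission(predictions, expected_ids)
payload = json.dumps(submission, indent=2)  # write this string to a .json file
```

In practice, the expected ids would be taken from the keys of the dummy submission file or from the test split of each task, so that a missing prediction is caught locally rather than after a long evaluation wait.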