Sponsored content: created in partnership with the Duolingo English Test.
The Duolingo English Test (DET) gives test takers a single score from 10 to 160, determined by a dual-methodology assessment that combines advanced AI technology with linguistic expertise.
When a university receives a student's DET score, it sees a single composite figure between 10 and 160. Crucially, this final score is the product of a meticulously designed assessment framework that integrates advanced psychometric principles, cutting-edge technology and deep expertise in language evaluation and AI. Understanding the validity and reliability of the DET requires a closer look at how it translates test takers' responses into a unified measure of English-language proficiency.
The DET’s dual structure
The DET provides a holistic view of a candidate’s English-language skills by reporting two distinct subscore categories:
Individual subscores:
These scores reflect competence in the four fundamental language skills: reading, writing, listening and speaking. Performance on each task is evaluated based on criteria such as correctness, completeness and clarity, and these evaluations are aggregated to produce four individual subscores.
Integrated subscores:
Recognising that language skills are rarely used in isolation – particularly in academic and professional contexts – the DET also reports integrated scores. These scores demonstrate how skills function together and are crucial indicators of a candidate’s ability to navigate real-world academic tasks, such as contributing to a class discussion or composing an essay.
The integrated scores reported by the DET are:
- Literacy: Reading and writing combined
- Comprehension: Reading and listening combined
- Production: Writing and speaking combined
- Conversation: Listening and speaking combined
All final scores are reported on the 10–160 scale, using 5-point increments.
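Duolingo does not publish the exact formula that combines skills into the integrated subscores, so the sketch below is only an illustration: it assumes each integrated subscore is a plain average of its two contributing skills, snapped to the 10–160 reporting scale in 5-point increments. The subscore values themselves are hypothetical.

```python
def round_to_scale(score: float, lo: int = 10, hi: int = 160, step: int = 5) -> int:
    """Clamp a raw score to the reporting scale and round to the nearest 5-point increment."""
    snapped = round(score / step) * step
    return max(lo, min(hi, int(snapped)))

# Hypothetical individual subscores for one test taker.
subscores = {"reading": 135, "writing": 120, "listening": 140, "speaking": 115}

# Illustrative assumption: each integrated subscore is shown as a simple average
# of its two contributing skills (the DET's real aggregation is more sophisticated).
integrated = {
    "literacy":      round_to_scale((subscores["reading"] + subscores["writing"]) / 2),
    "comprehension": round_to_scale((subscores["reading"] + subscores["listening"]) / 2),
    "production":    round_to_scale((subscores["writing"] + subscores["speaking"]) / 2),
    "conversation":  round_to_scale((subscores["listening"] + subscores["speaking"]) / 2),
}
print(integrated)  # e.g. {'literacy': 130, 'comprehension': 140, 'production': 120, 'conversation': 130}
```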
Scoring mechanisms: Rules-based and AI-driven evaluation
The scoring process depends heavily on the nature of the task the test taker completes.
For assessing receptive skills, the majority of reading and listening tasks are evaluated via structured, rules-based scoring that does not rely on machine-learning models. Since these tasks typically have a single correct answer or a finite set of acceptable responses, evaluation is instant and consistent. Scoring may involve the following approaches, illustrated in a short sketch after this list:
- Binary grading: e.g., correct/incorrect for vocabulary tasks
- Partial credit: e.g., for multi-step comprehension tasks
- Degrees of correctness: e.g., allowing for minor variations in dictation or highlight tasks
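As a rough illustration of this rules-based approach, the Python sketch below shows how binary grading, partial credit and degrees of correctness could be implemented for hypothetical items; it is not Duolingo's actual scoring code.

```python
def score_binary(response: str, answer: str) -> float:
    """Binary grading: full credit only for an exact match, e.g. a vocabulary item."""
    return 1.0 if response.strip().lower() == answer.strip().lower() else 0.0

def score_partial_credit(responses: list[str], answers: list[str]) -> float:
    """Partial credit: fraction of steps answered correctly in a multi-step comprehension task."""
    correct = sum(r.strip().lower() == a.strip().lower() for r, a in zip(responses, answers))
    return correct / len(answers)

def score_degrees(response: str, answer: str) -> float:
    """Degrees of correctness: tolerate minor slips, e.g. in a dictation task,
    by crediting word-level matches instead of requiring a perfect transcript."""
    resp_words = response.lower().split()
    ans_words = answer.lower().split()
    matches = sum(r == a for r, a in zip(resp_words, ans_words))
    return matches / max(len(ans_words), 1)

print(score_degrees("the quick brown focks jumps", "the quick brown fox jumps"))  # 0.8
```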
Conversely, productive skills (speaking and writing) are assessed through open-ended tasks, such as describing a photo or responding to a written prompt, which require more nuanced analysis. These responses are scored by sophisticated AI models that have been rigorously trained on thousands of expert-rated samples. These models evaluate performance across key linguistic and communicative dimensions:
- Content: Relevance and development of the response
- Coherence: Organisation and logical flow of the ideas
- Vocabulary and grammar: Range, appropriateness and accuracy of language
- Fluency and pronunciation: Clarity, naturalness and intelligibility of spoken language
The assessment is designed to mimic human expertise, analysing both what a test taker says and how they say it, with ongoing calibration ensuring the scores fairly and accurately reflect genuine English-language ability.
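As a purely illustrative sketch, the snippet below shows one way dimension-level ratings could be combined into a single task score. The rating scale, weights and dimension names are assumptions made for this example; the DET's production scoring relies on machine-learning models trained on expert-rated responses, not hand-set weights.

```python
# Illustrative weights only; not the DET's actual scoring model.
DIMENSION_WEIGHTS = {
    "content": 0.25,      # relevance and development of the response
    "coherence": 0.25,    # organisation and logical flow of ideas
    "language": 0.25,     # vocabulary and grammar: range, appropriateness, accuracy
    "delivery": 0.25,     # fluency, pronunciation and intelligibility (spoken tasks)
}

def aggregate_task_score(ratings: dict[str, float]) -> float:
    """Combine per-dimension ratings (each between 0.0 and 1.0) into one task-level score."""
    return sum(DIMENSION_WEIGHTS[d] * ratings[d] for d in DIMENSION_WEIGHTS)

example_ratings = {"content": 0.80, "coherence": 0.70, "language": 0.75, "delivery": 0.65}
print(f"Task score: {aggregate_task_score(example_ratings):.2f}")
```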
The role of computer adaptivity and fairness
A core design feature that enhances both the fairness and efficiency of the DET is its computer-adaptive format.
A computer-adaptive test continuously adjusts to the test taker. It begins with questions of moderate difficulty for everyone and quickly learns the individual's level: answering a question correctly leads to a slightly harder question next, while answering incorrectly leads to an easier one.
By making the test harder when a candidate does well and easier when they struggle, the system quickly and efficiently homes in on their skill level. This is crucial because the final score reflects not just how many questions were answered correctly, but how difficult those questions were. Even if two students see completely different questions, their scores are still fair and directly comparable, because the system knows the difficulty of every question delivered.
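The simulation below is a minimal sketch of that adaptive loop, using assumed step sizes and a simple probability model rather than the item-response-theory machinery a real adaptive test relies on: item difficulty rises after a correct answer, falls after an incorrect one, and the running estimate tracks the difficulty of items the candidate can handle.

```python
import random

def simulate_adaptive_test(true_ability: float, n_items: int = 25, step: float = 8.0) -> float:
    """Toy adaptive test on the 10-160 scale; step sizes and probability model are illustrative."""
    difficulty = 85.0   # everyone starts at moderate difficulty
    estimate = 85.0     # running estimate of the candidate's level
    for _ in range(n_items):
        # The chance of answering correctly falls as item difficulty exceeds true ability.
        p_correct = 1.0 / (1.0 + 10 ** ((difficulty - true_ability) / 40.0))
        if random.random() < p_correct:
            difficulty += step    # serve a harder item next
            estimate += step / 2  # evidence the candidate sits above the current estimate
        else:
            difficulty -= step    # serve an easier item next
            estimate -= step / 2
        difficulty = max(10.0, min(160.0, difficulty))
        estimate = max(10.0, min(160.0, estimate))
    return estimate

print(round(simulate_adaptive_test(true_ability=120.0)))  # drifts towards the true level
```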
Why adaptive testing matters
Adaptive testing is not just a new way to test – it fundamentally improves the experience for the test taker and the validity of the results. Its benefits include:
- Shorter test time: Because the test finds your level quickly, it needs fewer questions to deliver an accurate score
- A better experience: You are challenged by questions that are close to your ability, avoiding the boredom of easy questions or the stress of impossibly hard ones
- More accurate scores: The test spends its time asking questions around your skill level, resulting in a more precise and reliable measure of your ability
- Better security: Since every test taker gets a slightly different set of questions, it is much harder for anyone to cheat or share answers
The adaptive format ensures your final score is a precise and fair measure of your unique English proficiency.
Reliable, valid and fair
To maintain the integrity of the assessment, the DET incorporates multiple layers of quality control. This includes flagging responses that are off-topic or made in bad faith, and conducting continuous fairness checks across demographic groups – including by first language, gender and country – to guard against bias.
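Duolingo's own fairness analyses are far more rigorous than anything that fits in a few lines, but as a loose illustration of the idea, the sketch below compares mean scores across hypothetical demographic groups and flags any group whose average drifts notably from the overall mean. The data, grouping and threshold are all assumptions for this example.

```python
from statistics import mean

# Hypothetical score data grouped by a demographic attribute (e.g. first language).
scores_by_group = {
    "group_A": [115, 120, 130, 125, 110],
    "group_B": [118, 122, 128, 124, 119],
}

all_scores = [s for group_scores in scores_by_group.values() for s in group_scores]
overall_mean = mean(all_scores)

# Flag groups whose mean deviates from the overall mean by more than an
# illustrative threshold; a real analysis would use proper statistical tests.
THRESHOLD = 5.0
for group, scores in scores_by_group.items():
    gap = mean(scores) - overall_mean
    status = "review" if abs(gap) > THRESHOLD else "ok"
    print(f"{group}: mean={mean(scores):.1f}, gap={gap:+.1f} -> {status}")
```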
Ultimately, the DET is engineered to deliver results that are valid and reliable, reflecting real-world academic-English ability and producing consistent scores if a candidate retakes the test.