<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Blueprints & Bytes]]></title><description><![CDATA[Blueprints & Bytes]]></description><link>https://blog.vishnusujeesh.com</link><generator>RSS for Node</generator><lastBuildDate>Thu, 09 Apr 2026 15:44:04 GMT</lastBuildDate><atom:link href="https://blog.vishnusujeesh.com/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Setting up your Thinkpad for Linux]]></title><description><![CDATA[Setting up your Thinkpad for Linux
A guide for beginners to get started
If Windows is already installed, do this first
Log into Windows with admin credentials.

Shrink the existing partition that Windows is on as much as possible using the disk manag...]]></description><link>https://blog.vishnusujeesh.com/setting-up-your-thinkpad-for-linux</link><guid isPermaLink="true">https://blog.vishnusujeesh.com/setting-up-your-thinkpad-for-linux</guid><dc:creator><![CDATA[Vishnu S]]></dc:creator><pubDate>Sun, 12 Oct 2025 06:23:13 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/43P5FiTFcXo/upload/0d5d3ac034a28b928336c02aac7ee401.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3 id="heading-setting-up-your-thinkpad-for-linux"><strong>Setting up your Thinkpad for Linux</strong></h3>
<p>A guide for beginners to get started</p>
<h3 id="heading-if-windows-is-already-installed-do-this-first">If Windows is already installed, do this first</h3>
<p>Log into Windows with admin credentials.</p>
<ul>
<li><p><strong>Shrink the existing partition</strong> that Windows is on as much as possible using the disk management tool (or leave space for Windows if you think you will use it regularly). Do not create a new drive from the unallocated space; that will be done when you install your Linux distribution.</p>
</li>
<li><p><strong>Disable fast startup</strong> from power settings.</p>
</li>
<li><p><strong>Disable any Bitlocker encryption</strong>. To save space, you may also want to disable <strong>hibernation</strong> by running <code>powercfg.exe /hibernate off</code> from an <strong>Administrator Command Prompt</strong>.</p>
</li>
</ul>
<hr />
<h3 id="heading-update-the-settings-in-your-biosuefi">Update the settings in your BIOS/UEFI</h3>
<p>Boot into the BIOS/UEFI interface by pressing <strong>F1</strong> when the laptop is starting up (on some ThinkPad models, you may need to press <strong>Enter</strong> first, then F1).</p>
<ul>
<li><p>Ensure that the <strong>Secure Boot</strong> and encryption options are disabled (usually found under the <strong>Security</strong> tab).</p>
</li>
<li><p>For optimal compatibility, you may also want to check the <strong>Config</strong> or <strong>Restart</strong> menu and ensure <strong>OS Optimized Defaults</strong> is <strong>Disabled</strong>.</p>
</li>
<li><p>Some newer ThinkPads may require setting the <strong>Suspend Mode</strong> (often under the <strong>Power</strong> tab) to <strong>Linux</strong> or <strong>"Linux S3"</strong> for proper suspend-to-RAM functionality.</p>
</li>
<li><p>Save your changes with <strong>F10</strong> and exit.</p>
</li>
</ul>
<hr />
<h3 id="heading-create-a-bootable-usb">Create a bootable USB</h3>
<p>Download the <strong>ISO image</strong> for the distribution of your choice (e.g., Ubuntu LTS).</p>
<ul>
<li>Burn it onto a USB using <strong>Rufus</strong> or <strong>BalenaEtcher</strong>.</li>
</ul>
<hr />
<h3 id="heading-install-ubuntu-lts">Install Ubuntu LTS</h3>
<ul>
<li><p><strong>Boot from the USB:</strong> Plug the USB drive into your ThinkPad. When starting the laptop, press <strong>F12</strong> immediately to bring up the <strong>Boot Menu</strong>. Select the USB drive from the list and press Enter.</p>
</li>
<li><p><strong>Start the Installation:</strong> Once the USB boots, you will see the boot menu. Select <strong>Try or Install Ubuntu</strong>. It's generally recommended to select "Try Ubuntu" first to ensure everything works before committing to the installation, or simply select "Install Ubuntu" to proceed directly.</p>
</li>
<li><p><strong>Follow the Installer Prompts:</strong></p>
<ul>
<li><p>Choose your <strong>language</strong> and <strong>keyboard layout</strong>.</p>
</li>
<li><p>Select your installation type: choose <strong>Normal Installation</strong> and check the box to <strong>Install third-party software for graphics and Wi-Fi hardware</strong> (this is important for proprietary drivers like Nvidia graphics or some Wi-Fi cards).</p>
</li>
<li><p>When prompted for the <strong>Installation type</strong>, choose <strong>"Install Ubuntu alongside Windows Boot Manager"</strong> if you want to dual-boot, or <strong>"Something else"</strong> for a custom partition setup. If you created unallocated space earlier, select that space and create a new partition for Ubuntu (at minimum, one partition for the root directory <code>/</code> is needed; a separate partition for <code>/home</code> and a <strong>swap</strong> partition are optional but recommended).</p>
</li>
<li><p>Complete the remaining steps by setting your <strong>time zone</strong> and creating your <strong>user account</strong> and password.</p>
</li>
<li><p>Click <strong>Install Now</strong> and confirm the changes to the disk.</p>
</li>
</ul>
</li>
<li><p><strong>Restart:</strong> Once the installation is complete, select <strong>Restart Now</strong> and remove the USB drive when prompted. The laptop should now boot into the <strong>GRUB</strong> bootloader, allowing you to choose between Ubuntu and Windows.</p>
</li>
</ul>
<hr />
<h3 id="heading-post-installation-steps">Post-Installation Steps</h3>
<p>After successfully logging into your new Ubuntu desktop, you should perform a few initial setup tasks:</p>
<ul>
<li><p><strong>Update Your System:</strong> Open a terminal (Ctrl+Alt+T) and run the following commands to ensure all your packages are up-to-date:</p>
<pre><code>sudo apt update
sudo apt full-upgrade -y
</code></pre></li>
<li><p><strong>Install Drivers (if needed):</strong> If you checked the "third-party software" box, most drivers should be installed. However, if you have an <strong>Nvidia GPU</strong> or encounter issues with Wi-Fi, go to <strong>Software &amp; Updates</strong> → <strong>Additional Drivers</strong> tab and check if any proprietary drivers are available and enabled.</p>
</li>
<li><p><strong>Configure TLP Battery Charge Thresholds:</strong></p>
<p>TLP is essential for optimizing battery life and health on ThinkPads. To protect your battery, you can set a custom charge threshold to stop charging before it reaches 100%.</p>
<ol>
<li><p><strong>Install TLP and TLP-RDW:</strong></p>
<pre><code class="lang-plaintext">sudo apt install tlp tlp-rdw
</code></pre>
</li>
<li><p><strong>Edit the TLP Configuration File:</strong></p>
<p>Open the configuration file <code>/etc/tlp.conf</code> using a text editor like vim:</p>
<pre><code class="lang-plaintext">sudo vim /etc/tlp.conf
</code></pre>
</li>
<li><p><strong>Set the Thresholds:</strong></p>
<p>Scroll down to the <strong>Battery Care</strong> section and find the lines for <code>START_CHARGE_THRESH_BAT0</code> and <code>STOP_CHARGE_THRESH_BAT0</code>. Uncomment these lines (remove the <code>#</code> symbol) and set them to your desired values:</p>
<ul>
<li><p><strong>Start Charging Below 60%:</strong> Set <code>START_CHARGE_THRESH_BAT0=60</code></p>
</li>
<li><p><strong>Stop Charging at 75%:</strong> Set <code>STOP_CHARGE_THRESH_BAT0=75</code></p>
</li>
</ul>
</li>
</ol>
</li>
</ul>
<p>Your edited lines should look like this (for a single battery):</p>
<pre><code class="lang-plaintext"># Main / Internal battery (values in %)
START_CHARGE_THRESH_BAT0=60
STOP_CHARGE_THRESH_BAT0=75
</code></pre>
<p><em>Save and exit Vim with</em> <code>:wq</code></p>
<ol start="4">
<li><p><strong>Apply and Verify the Settings:</strong></p>
<p>Apply the new configuration and start TLP:</p>
<pre><code class="lang-plaintext">sudo tlp start
</code></pre>
<p>You can verify the settings were applied successfully by checking the battery status:</p>
<pre><code class="lang-plaintext">sudo tlp-stat -b
</code></pre>
<p> Look for the output showing the configured <code>charge_start_threshold</code> and <code>charge_stop_threshold</code> values.</p>
</li>
</ol>
<ul>
<li><strong>Final Check:</strong> Test closing the laptop lid to ensure the system correctly enters and resumes from a low-power state (<strong>suspend-to-RAM</strong> or <strong>S3 state</strong>).</li>
</ul>
<h3 id="heading-configuring-fingerprint-authentication">Configuring Fingerprint Authentication</h3>
<p>Configuring fingerprint authentication using <strong>fprintd</strong> and <strong>pam-auth-update</strong> on a Linux system (like Ubuntu) typically involves three main steps: installing the necessary packages, enrolling your fingerprint, and enabling the authentication module.</p>
<hr />
<h3 id="heading-1-install-required-packages">1. Install Required Packages</h3>
<p>First, ensure your system recognizes your fingerprint reader. You can often check this with the command <code>lsusb</code>. If your device is supported by <code>libfprint</code>, install the required packages:</p>
<pre><code class="lang-plaintext">sudo apt update
sudo apt install fprintd libpam-fprintd
</code></pre>
<ul>
<li><p><code>fprintd</code>: This is the fingerprint matching daemon that communicates with your reader and manages fingerprints.</p>
</li>
<li><p><code>libpam-fprintd</code>: This is the Pluggable Authentication Module (PAM) component that allows system services to use <code>fprintd</code> for authentication.</p>
</li>
</ul>
<hr />
<h3 id="heading-2-enroll-your-fingerprint">2. Enroll Your Fingerprint</h3>
<p>Once the packages are installed, you need to enroll your fingerprint so the system can recognize it.</p>
<pre><code class="lang-plaintext">fprintd-enroll
</code></pre>
<p>Follow the on-screen prompts, which will typically ask you to swipe or tap your finger on the sensor multiple times until the enrollment is complete. It will usually specify which finger it is enrolling (e.g., "right-index-finger").</p>
<p>You can also use your desktop environment's <strong>User Settings</strong> (e.g., in GNOME or KDE) to enroll your fingerprint, which often provides a graphical interface.</p>
<hr />
<h3 id="heading-3-enable-fingerprint-authentication-via-pam">3. Enable Fingerprint Authentication via PAM</h3>
<p>The <code>pam-auth-update</code> tool configures the system's authentication stack (PAM) to use the new fingerprint module.</p>
<pre><code class="lang-plaintext">sudo pam-auth-update
</code></pre>
<p>This command will open a text-based configuration menu.</p>
<ol>
<li><p>Use the <strong>Up/Down arrow keys</strong> to navigate to <strong>"Fingerprint authentication"</strong>.</p>
</li>
<li><p>Press the <strong>Spacebar</strong> to place an asterisk (<code>*</code>) next to it, indicating it is selected/enabled.</p>
</li>
<li><p>Use the <strong>Tab</strong> key to highlight <code>&lt;Ok&gt;</code>.</p>
</li>
<li><p>Press <strong>Enter</strong> to save the configuration and exit.</p>
</li>
</ol>
<p>This step integrates the <code>pam_fprintd.so</code> module into the common authentication stack (usually by modifying files like <code>/etc/pam.d/common-auth</code>), allowing it to be used for things like graphical login, screen unlock, and <code>sudo</code> elevation.</p>
<hr />
<h3 id="heading-4-test-the-configuration">4. Test the Configuration</h3>
<p>After completing these steps, you should be able to test the fingerprint authentication:</p>
<ul>
<li><p><strong>Sudo:</strong> Open a terminal and run a <code>sudo</code> command. You should be prompted to either scan your finger or enter your password.</p>
</li>
<li><p><strong>Login/Screen Unlock:</strong> Lock your screen or log out. The login prompt should offer a fingerprint option.</p>
</li>
</ul>
<p>Now you can proceed to install all the other packages and utilities you might need, like <code>homebrew</code> for package management.</p>
]]></content:encoded></item><item><title><![CDATA[Digitalizing checklists with Google Apps Script]]></title><description><![CDATA[The main issue with manual checklists is that they are a pain to use. The friction of having to print out a checklist every time that you need to use it is bound to eventually impact the frequency of its use as it provides an unsatisfactory user expe...]]></description><link>https://blog.vishnusujeesh.com/digitalizing-checklists-with-google-apps-script</link><guid isPermaLink="true">https://blog.vishnusujeesh.com/digitalizing-checklists-with-google-apps-script</guid><category><![CDATA[google apps script]]></category><category><![CDATA[rpa]]></category><category><![CDATA[gemini]]></category><category><![CDATA[automation]]></category><dc:creator><![CDATA[Vishnu S]]></dc:creator><pubDate>Thu, 21 Aug 2025 15:34:20 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/8gr6bObQLOI/upload/ccac44bbc1e06f89100adaab14fc4919.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1755785899438/182c7e2f-00a6-4825-b7ef-544dda33693a.png" alt="Flow of the automation" class="image--center mx-auto" /></p>
<p>The main issue with manual checklists is that they are a pain to use. The friction of having to print out a checklist every time that you need to use it is bound to eventually impact the frequency of its use as it provides an unsatisfactory user experience.</p>
<p>If enough users feel that way, then the whole point of the checklist is moot. That is why we should automate these kinds of workflows as early as possible, which will improve compliance with the new workflow by reducing the friction involved in using checklists.</p>
<p>We will have 4 major stages in converting these checklists into forms:</p>
<ol>
<li><p>Convert the PDFs into a structured representation like JSON or CSV that we can directly use in code</p>
</li>
<li><p>Write some Apps Script (Google’s flavor of JavaScript) using the Apps Script platform to programmatically create the form.</p>
</li>
<li><p>Write some Apps Script to manage some maintenance tasks, like keeping the Google Sheets backend a manageable size.</p>
</li>
<li><p>Write an ingest pipeline that uses Apps Script to scan a folder in Google Drive for new checklists and incorporate them to create a new version of the form (using LLMs for processing the PDFs into JSON).</p>
</li>
</ol>
]]></content:encoded></item><item><title><![CDATA[Memory mapped areas for Lucene]]></title><description><![CDATA[When deploying search infrastructure at scale, especially with technologies like Apache Lucene or Elasticsearch, performance tuning often goes beyond application-level optimizations. One of the most critical yet frequently overlooked system parameter...]]></description><link>https://blog.vishnusujeesh.com/memory-mapped-areas-for-lucene</link><guid isPermaLink="true">https://blog.vishnusujeesh.com/memory-mapped-areas-for-lucene</guid><category><![CDATA[search]]></category><category><![CDATA[sharding]]></category><category><![CDATA[lucene]]></category><dc:creator><![CDATA[Vishnu S]]></dc:creator><pubDate>Thu, 21 Aug 2025 02:15:14 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/RI7l-rspNpY/upload/a917c01108f817997c9921ff1abb91a8.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>When deploying search infrastructure at scale, especially with technologies like Apache Lucene or Elasticsearch, performance tuning often goes beyond application-level optimizations. One of the most critical yet frequently overlooked system parameters is <code>vm.max_map_count</code>. This Linux kernel setting governs how many virtual memory areas (VMAs) a single process can allocate, and it plays a pivotal role in how Lucene interacts with the operating system to manage large search indexes efficiently.</p>
<p>Lucene is a high-performance, full-text search library that underpins many modern search platforms. One of its key architectural choices is the use of memory-mapped files. Instead of reading index files into memory using traditional I/O operations, Lucene maps these files directly into the process’s virtual address space. This is done using the <code>mmap</code> system call, which allows Lucene to access file contents as if they were part of memory. The operating system handles the actual loading of data into RAM, doing so lazily—only when the application accesses specific portions of the file.</p>
<p>This design is elegant and efficient. It avoids the overhead of manual file reads and buffering, and it allows Lucene to work with very large indexes without consuming excessive heap memory. However, it introduces a dependency on the operating system’s ability to manage a large number of memory mappings. Each memory-mapped file segment creates a virtual memory area in the kernel, tracked by a data structure called <code>vm_area_struct</code>. These structures reside in RAM and consume kernel memory. The <code>vm.max_map_count</code> parameter sets a hard limit on how many of these mappings a single process can have.</p>
<p>By default, most Linux distributions set this value to 65,530. While this is sufficient for many applications, it quickly becomes a bottleneck for Lucene-based systems, especially when dealing with large indexes or numerous shards. To understand why, consider how Lucene structures its indexes. Each index is composed of multiple segments, and each segment consists of several files—such as term dictionaries, postings lists, and stored fields. Lucene typically maps these files in chunks of up to 1 GiB. A single index with dozens of segments can easily require hundreds or thousands of mappings. In Elasticsearch, where a single node might host thousands of shards, the number of required mappings can grow rapidly. If the process exceeds the <code>vm.max_map_count</code> limit, it will fail to create new mappings, leading to errors like <code>java.io.IOException: Map failed</code>.</p>
<p>Increasing <code>vm.max_map_count</code> allows Lucene and Elasticsearch to scale more effectively by enabling more memory-mapped files per process. However, this change has implications for system memory usage. While increasing the mapping count does not directly increase the amount of RAM allocated to the kernel, it does allow more VMAs to be created, each of which consumes kernel memory. On average, each VMA uses about 128 bytes of RAM, plus additional overhead from the kernel’s memory allocator. For example, increasing the limit to 262,144 mappings—a common recommendation for Elasticsearch—would allow VMAs consuming roughly 33.5 MB of kernel memory if the limit were fully used. On a system with 8 GB of RAM, this is a negligible amount, but it’s important to understand that this memory is taken from the same physical RAM pool used by user-space applications.</p>
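<p>A quick back-of-the-envelope check of that figure (assuming, as stated above, roughly 128 bytes of kernel bookkeeping per VMA):</p>

```python
# Estimated kernel memory if a process used every allowed mapping,
# assuming ~128 bytes of vm_area_struct bookkeeping per VMA.
BYTES_PER_VMA = 128
max_map_count = 262_144   # common Elasticsearch recommendation

total_bytes = max_map_count * BYTES_PER_VMA
print(f"{total_bytes / 1e6:.2f} MB")   # 33.55 MB
```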
<p>This leads to an important distinction: memory-mapped files do impact the total memory an application can consume, but they behave differently from traditional heap or stack allocations. When a file is memory-mapped, it is not immediately loaded into RAM. Instead, the operating system loads pages into memory on demand, as the application accesses them. This lazy loading mechanism means that the mapped file’s size does not directly translate to RAM usage. Only the accessed portions occupy physical memory, and these pages are managed by the OS’s page cache. This allows Lucene to work with very large indexes without exhausting heap memory or requiring manual memory management.</p>
<p>From the application’s perspective, memory-mapped files expand the process’s virtual memory footprint. On 64-bit systems, the address space is vast, so this is rarely a constraint. However, each mapping still requires kernel bookkeeping, and if many mappings are created, the kernel memory usage can grow. This indirectly reduces the amount of RAM available for other applications, especially if multiple memory-intensive processes are running concurrently.</p>
<p>In essence, memory-mapped files in Lucene act like a giant pointer table into the index files stored on disk. Instead of reading data into buffers, Lucene uses these mappings to navigate the index structure efficiently. The operating system handles the actual data loading, caching, and eviction, allowing Lucene to focus on search logic. This design is elegant and powerful, but it depends heavily on the system’s ability to support a large number of mappings.</p>
<p>For systems with 8 GB of RAM, setting <code>vm.max_map_count</code> to 262,144 is generally safe and recommended. This value provides a generous buffer for Lucene’s mapping needs without significantly impacting other applications. To apply this setting temporarily, one can use:</p>
<pre><code class="lang-bash">sudo sysctl -w vm.max_map_count=262144
</code></pre>
<p>To make it persistent across reboots:</p>
<pre><code class="lang-bash"><span class="hljs-built_in">echo</span> <span class="hljs-string">"vm.max_map_count=262144"</span> | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
</code></pre>
<p>Ultimately, tuning <code>vm.max_map_count</code> is about enabling Lucene to scale efficiently while maintaining system stability. It is a small change with a big impact, especially in environments where search performance and reliability are paramount. By understanding how memory-mapped files work and how they interact with kernel memory, developers and system administrators can make informed decisions that optimize both application behavior and system resource usage.</p>
]]></content:encoded></item><item><title><![CDATA[Generative AI for learning by quizzing]]></title><description><![CDATA[Research has shown that being quizzed and actively engaging with the material is the best way to learn and retain the content that was learnt. But that doesn’t mean that all quizzes are created equal. There is a “Goldilocks” zone of sorts for the dif...]]></description><link>https://blog.vishnusujeesh.com/generative-ai-for-learning-by-quizzing</link><guid isPermaLink="true">https://blog.vishnusujeesh.com/generative-ai-for-learning-by-quizzing</guid><category><![CDATA[RAG ]]></category><category><![CDATA[education]]></category><category><![CDATA[Quiz]]></category><dc:creator><![CDATA[Vishnu S]]></dc:creator><pubDate>Wed, 20 Aug 2025 17:49:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1755707321450/717d184d-197d-4c1c-9ae2-35d787871970.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Research has shown that being quizzed and actively engaging with the material is the best way to learn and retain the content that was learnt. But that doesn’t mean that all quizzes are created equal. There is a “Goldilocks” zone of sorts for the difficulty of the questions, where the questions are neither so easy that you can answer them without thinking, nor so hard that you get demotivated to even try answering them.</p>
<p>That sweet spot of difficulty keeps learners on their toes, coming back for more questions, which will ultimately help their understanding of the material.</p>
<h2 id="heading-generating-questions-of-different-difficulty-levels">Generating questions of different difficulty levels</h2>
<h3 id="heading-rag-for-getting-relevant-subject-matter">RAG for getting relevant subject matter</h3>
<h3 id="heading-using-graphs-to-track-subject-matter-coverage-and-concept-hierarchies">Using graphs to track subject matter coverage and concept hierarchies</h3>
<p>Generating questions from a text corpus involves a rich interplay of natural language processing (NLP), large language models (LLMs), and graph-based methods. This process can be tailored to produce various types of questions—factual, inferential, multiple-choice, and open-ended—depending on the intended use case, such as educational assessment, conversational AI, or content enrichment.</p>
<p>The process begins with <strong>preprocessing the corpus</strong>. Text is segmented into sentences and paragraphs, cleaned to remove noise, and analyzed to identify key entities and concepts using techniques like Named Entity Recognition (NER), coreference resolution, and keyword extraction. These steps help isolate the most informative parts of the text.</p>
<p>Next, <strong>candidate sentences</strong> are selected based on their semantic richness. Sentences containing definitions, causal relationships, or important facts are prioritized using scoring methods such as TF-IDF or contextual embeddings from models like BERT. These sentences serve as the foundation for question generation.</p>
<p>For <strong>factual and multiple-choice questions</strong>, several approaches can be used. Rule-based systems apply syntactic parsing to extract subject-verb-object structures and transform them into questions using predefined templates. For example, from “The Eiffel Tower is in Paris,” a rule might generate “Where is the Eiffel Tower located?” Neural models like T5 and BART, fine-tuned for question generation, can produce more flexible and abstractive questions from input passages. Additionally, models trained on datasets like SQuAD can extract question-answer pairs directly from text. Multiple-choice questions require an additional step: generating distractors. These can be selected using semantic similarity measures (e.g., WordNet or embedding distances) or entity type matching to ensure plausible alternatives.</p>
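<p>As a toy illustration of the rule-based route (a real system would use syntactic parsing rather than a regular expression, and this pattern is deliberately narrow):</p>

```python
# Toy rule-based question generation: turn "<X> is in <Y>" statements
# into "Where is <X> located?" questions. A sketch only; real systems
# parse subject-verb-object structures instead of pattern matching.
import re

def location_question(sentence):
    m = re.match(r"(?:The )?(.+?) is (?:located )?in (.+?)\.?$", sentence)
    if not m:
        return None
    subject, place = m.groups()
    return f"Where is the {subject} located?", place

q, a = location_question("The Eiffel Tower is in Paris.")
print(q)  # Where is the Eiffel Tower located?
print(a)  # Paris
```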
<p><strong>Open-ended question generation</strong>, however, demands a deeper semantic understanding. These questions aim to elicit reasoning, interpretation, or synthesis rather than recall. To generate them, one must first identify <strong>conceptual anchors</strong>—themes, causal links, or contrasting ideas—using topic modeling (e.g., LDA or BERTopic), semantic clustering, or graph-based representations. Concept graphs, where nodes represent ideas and edges represent relationships, help pinpoint areas suitable for deeper inquiry.</p>
<p>LLMs play a central role in crafting open-ended questions. By prompting models like GPT-4 or T5 with context-rich passages and directives such as “Generate a question that encourages critical thinking,” one can produce questions that invite analysis or reflection. Few-shot prompting, where examples of open-ended questions are provided, can further guide the model’s output. For instance, from a passage on climate change, the model might generate “How might local communities adapt to the long-term effects of rising sea levels?”—a question that encourages exploration of implications and strategies.</p>
<p>Graph-based methods can also be used to <strong>refine and diversify</strong> the question set. By constructing graphs of generated questions and analyzing their semantic similarity, one can cluster and prune redundant questions. Centrality measures help identify questions that touch on key ideas or bridge different concepts, ensuring broad and meaningful coverage.</p>
<p>Finally, generated questions are <strong>evaluated and ranked</strong> based on clarity, relevance, and depth. Readability metrics, semantic alignment with the source text, and classification based on Bloom’s taxonomy can be used to assess the quality of each question.</p>
<p>In practice, this integrated approach allows for scalable and context-aware question generation. For example, given a corpus of science articles, one could extract factual sentences, use T5 to generate “What” or “Why” questions, rank them by relevance, and optionally generate distractors for multiple-choice formats. For open-ended questions, concept graphs and LLMs can be used to produce prompts that encourage critical thinking and discussion.</p>
<h2 id="heading-identifying-the-difficulty-level-of-questions">Identifying the difficulty level of questions</h2>
<p>The original idea was to generate pairs for each question in the set with another question from the set, and then compare whether question A was tougher than question B. If it was, the tougher question would be assigned a point. Ultimately, the questions could be ranked by the number of points they had accumulated. However, even for a small data set of 100 questions, that amounts to \(\binom{100}{2} = 4950\) pairwise comparisons, which quickly becomes impractical to score manually.</p>
<h3 id="heading-the-bradley-terry-model">The Bradley-Terry Model</h3>
<p>The Bradley-Terry model is one way to rank questions based on pairwise comparisons of their difficulty. Suppose you have a set of items (e.g., questions) \(Q_1, Q_2, Q_3,\dots,Q_n\). The Bradley-Terry model assigns each item a <strong>difficulty score</strong> \(\theta_i\). The probability that item \(i\) is judged more difficult than item \(j\) is:</p>
<p>$$P(i \text{ beats } j) = \frac{\theta_i}{\theta_i + \theta_j}$$</p><p>Where:</p>
<ul>
<li><p>\(\theta_i &gt; 0\) is the latent difficulty of item \(i\)</p>
</li>
<li><p>The model assumes that comparisons are <strong>independent</strong> and based only on the relative scores.</p>
</li>
</ul>
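<p>Under these assumptions, the scores \(\theta_i\) can be estimated from judgment data with the classic minorization-maximization (Zermelo) iteration, which repeatedly applies \(\theta_i \leftarrow W_i \big/ \sum_{j \ne i} n_{ij}/(\theta_i+\theta_j)\), where \(W_i\) is the number of comparisons item \(i\) won and \(n_{ij}\) the number of comparisons between \(i\) and \(j\). Below is a minimal Python sketch; the toy comparison data and the normalization choice are illustrative, not part of any particular library:</p>

```python
# Minimal sketch: fitting Bradley-Terry difficulty scores from pairwise
# "question i was judged harder than question j" outcomes via the classic
# MM (Zermelo) iteration. The toy data below is hypothetical.
from collections import defaultdict

def fit_bradley_terry(comparisons, n_items, iters=200):
    """comparisons: list of (harder, easier) index pairs."""
    wins = [0] * n_items
    matches = defaultdict(int)          # matches[(i, j)] = games between i and j
    for winner, loser in comparisons:
        wins[winner] += 1
        matches[(winner, loser)] += 1
        matches[(loser, winner)] += 1   # stored symmetrically

    theta = [1.0] * n_items
    for _ in range(iters):
        new = []
        for i in range(n_items):
            denom = sum(n / (theta[i] + theta[j])
                        for (a, j), n in matches.items() if a == i)
            new.append(wins[i] / denom if denom else theta[i])
        total = sum(new)
        theta = [t * n_items / total for t in new]  # fix the scale (identifiability)
    return theta

# Hypothetical judgments among three questions; question 2 wins most often.
data = [(2, 0), (2, 1), (1, 0), (2, 0), (1, 2), (0, 1)]
scores = fit_bradley_terry(data, 3)
print(scores)  # scores[2] > scores[1] > scores[0]
```

Note that the raw \(\theta_i\) are only identified up to a common scale factor, hence the normalization step; the estimates are also only well defined when the comparison graph is connected, which is exactly why the graph-tracking discussed below matters.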
<h3 id="heading-the-thurstone-model">The Thurstone Model</h3>
<p>The <strong>Thurstone model</strong>, specifically the <strong>Thurstone Case V model</strong>, is another approach for ranking items based on <strong>pairwise comparisons</strong>, similar in spirit to the Bradley-Terry model but based on <strong>psychometric theory</strong> and <strong>Gaussian assumptions</strong>.</p>
<p>Thurstone's model assumes that each item (e.g., question) has a <strong>latent difficulty value</strong> drawn from a <strong>normal distribution</strong>. When comparing two items, the probability that one is judged more difficult than the other depends on the <strong>difference in their latent values</strong>.</p>
<p>Let:</p>
<ul>
<li><p>\(\delta_i\) be the latent difficulty of item \(i\)</p>
</li>
<li><p>\(\delta_j\) be the latent difficulty of item \(j\)</p>
</li>
</ul>
<p>Then the probability that item \(i\) is judged more difficult than item \(j\) is:</p>
<p>$$P(i&gt;j) = \Phi\left(\frac{\delta_i-\delta_j}{\sqrt{2}\,\sigma}\right)$$</p><p>Where:</p>
<ul>
<li><p>\(\Phi\) is the cumulative distribution function (CDF) of the standard normal distribution</p>
</li>
<li><p>\(\sigma\) is the standard deviation of the latent values (often assumed equal across items)</p>
</li>
</ul>
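<p>Numerically, the Case V probability \(\Phi\big((\delta_i-\delta_j)/(\sqrt{2}\,\sigma)\big)\) can be evaluated directly with the standard normal CDF. A minimal Python sketch, where the difficulty values and \(\sigma\) are made-up numbers:</p>

```python
# Thurstone Case V: P(item i judged harder than item j) via the
# standard normal CDF. Latent difficulties and sigma are illustrative.
from statistics import NormalDist

def p_harder(delta_i, delta_j, sigma=1.0):
    return NormalDist().cdf((delta_i - delta_j) / (2 ** 0.5 * sigma))

print(p_harder(1.0, 0.0))   # > 0.5: item i is more often judged harder
print(p_harder(0.5, 0.5))   # exactly 0.5 for equal difficulties
```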
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Feature</strong></td><td><strong>Thurstone Model</strong></td><td><strong>Bradley-Terry Model</strong></td></tr>
</thead>
<tbody>
<tr>
<td>Assumption</td><td>Normal distribution of latent traits</td><td>Logistic distribution of latent traits</td></tr>
<tr>
<td>Probability function</td><td>Uses normal CDF</td><td>Uses logistic function</td></tr>
<tr>
<td>Interpretation</td><td>Psychometric, used in scaling attitudes</td><td>Probabilistic, used in sports, ranking</td></tr>
<tr>
<td>Data requirement</td><td>Pairwise comparisons</td><td>Pairwise comparisons</td></tr>
<tr>
<td>Extensions</td><td>Can model variance across items</td><td>Easier to extend with covariates</td></tr>
</tbody>
</table>
</div><p>Given the above comparison, we will use the Bradley-Terry model, with a subset of manually scored questions to get the difficulty ratings for our entire set of generated questions.</p>
<h3 id="heading-ensuring-the-quality-of-scores">Ensuring the quality of scores</h3>
<p>Using a graph to track the distance and connectivity between questions is useful to ensure that the pairs that are manually scored are the right choices for the model to generate unbiased scores.</p>
<p>Each question is treated as a node in a graph, and every pairwise comparison between questions forms an edge connecting two nodes. This graph structure serves multiple purposes.</p>
<p>First, it helps track coverage by ensuring that each question is involved in a sufficient number of comparisons. Questions with fewer comparisons—those with a low degree—should be prioritized when selecting new pairs to compare.</p>
<p>Second, the graph must remain connected to avoid isolated questions or disconnected clusters. Algorithms such as Breadth-First Search (BFS) or Union-Find can be used to verify and maintain this connectivity.</p>
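<p>A minimal Union-Find sketch for checking that the comparison graph has no disconnected clusters (the question count and comparison edges are illustrative):</p>

```python
class UnionFind:
    """Union-Find to verify the comparison graph stays connected."""
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

    def is_connected(self):
        roots = {self.find(i) for i in range(len(self.parent))}
        return len(roots) == 1

# Five questions; edges come from pairwise comparisons collected so far
uf = UnionFind(5)
for a, b in [(0, 1), (1, 2), (3, 4)]:
    uf.union(a, b)
print(uf.is_connected())  # False: {0,1,2} and {3,4} are separate clusters
```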
<p>Third, the selection of pairs should aim to be informative. Comparisons between questions that span distant regions of the graph—such as those differing significantly in difficulty—are more valuable than repeated comparisons between similar or adjacent questions.</p>
<p>Finally, the graph should be updated dynamically as more comparisons are collected. This allows for adaptive sampling, where the next pair to compare is chosen based on criteria such as uncertainty in estimated scores. Techniques like entropy or variance can guide this process to focus on the most informative comparisons.</p>
<h2 id="heading-understanding-the-competence-of-the-student">Understanding the competence of the student</h2>
<p>Once you have a list of questions sorted by difficulty, you can apply Item Response Theory (IRT) to estimate a student's competence level, commonly referred to as ability or proficiency in IRT terminology.</p>
<p>IRT is a family of statistical models that describe how the probability of a correct response to a question depends on both the student's latent ability and the characteristics of the question itself. In its simplest form—the one-parameter logistic model (1PL), also known as the Rasch model—the probability that a student with ability level θ answers a question correctly is given by:</p>
<p>$$P(\text{correct}) = \frac{1}{1 + e^{-(\theta - b_i)}}$$</p><p>Here, θ represents the student's ability, and \(b_i\) is the difficulty of question \(i\). More complex models extend this by adding parameters: the two-parameter logistic model (2PL) introduces a discrimination factor \(a_i\), which reflects how well a question distinguishes between students of different abilities, and the three-parameter logistic model (3PL) adds a guessing factor \(c_i\), accounting for the chance of answering correctly by guessing.</p>
<p>To use IRT in practice, you begin by assigning difficulty scores to your questions, which can be derived from models like Bradley-Terry or empirical performance data. You then administer a subset of these questions to a student and record their responses. By fitting an IRT model to this data, you can estimate the student's ability level θ.</p>
<p>This estimated ability can be used to place the student on a proficiency scale, recommend the next most suitable questions, and make comparisons across students. IRT is particularly powerful because it simultaneously accounts for both question difficulty and student ability, enabling adaptive testing and yielding more accurate and individualized assessments than raw scores alone.</p>
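<p>As a sketch, the ability θ under the 1PL model can be estimated by Newton-Raphson on the log-likelihood. The difficulty values and response patterns below are illustrative; note that the maximum-likelihood estimate diverges when a student answers every question correctly or incorrectly:</p>

```python
import math

def estimate_ability(difficulties, responses, iters=50):
    """Maximum-likelihood estimate of ability theta under the 1PL (Rasch)
    model via Newton-Raphson. responses[i] is 1 (correct) or 0 (wrong)."""
    theta = 0.0
    for _ in range(iters):
        grad, hess = 0.0, 0.0
        for b, y in zip(difficulties, responses):
            p = 1 / (1 + math.exp(-(theta - b)))  # P(correct | theta, b)
            grad += y - p           # score function
            hess -= p * (1 - p)     # second derivative of log-likelihood
        if hess == 0:
            break
        theta -= grad / hess        # Newton step
    return theta

difficulties = [-1.0, 0.0, 1.0, 2.0]
responses = [1, 1, 1, 0]
print(round(estimate_ability(difficulties, responses), 2))
```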
]]></content:encoded></item><item><title><![CDATA[A RAG-powered article publishing pipeline]]></title><description><![CDATA[Motivation
For some time now, I have been maintaining a Zettelkasten to capture notes on topics I explore, whether through my work or personal learning. Over the years, this system has grown into a substantial network of interconnected notes. The str...]]></description><link>https://blog.vishnusujeesh.com/a-rag-powered-article-publishing-pipeline</link><guid isPermaLink="true">https://blog.vishnusujeesh.com/a-rag-powered-article-publishing-pipeline</guid><category><![CDATA[RAG ]]></category><category><![CDATA[writing]]></category><category><![CDATA[zettelkasten]]></category><dc:creator><![CDATA[Vishnu S]]></dc:creator><pubDate>Fri, 08 Aug 2025 15:07:17 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/aId-xYRTlEc/upload/c0dc5458b13f26f39b205eb45cedead6.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1 id="heading-motivation">Motivation</h1>
<p>For some time now, I have been maintaining a Zettelkasten to capture notes on topics I explore, whether through my work or personal learning. Over the years, this system has grown into a substantial network of interconnected notes. The strength of the Zettelkasten lies in its ability to reveal relationships between ideas that I might never have discovered otherwise. This led me to wonder: what if I could distill these connections into cohesive articles and share the insights they reveal?</p>
<p>Because a Zettelkasten naturally organizes information as a graph, there is an opportunity to use a large language model (LLM) to enhance the writing process. The idea is to have the LLM analyze the graph structure, surface related content already in the system, and use that context to generate a structured article outline that highlights the key points to cover. From there, I can develop the full piece in my own writing style. This could significantly accelerate my publishing workflow, especially when the foundational research and projects are already in place.</p>
<p>Microsoft’s GraphRAG offers a similar concept. It applies graph-based retrieval to deliver more complete and contextually grounded answers from a text corpus by leveraging relationships between nodes and the entities they refer to. While my implementation will be far simpler than GraphRAG, my goal is to achieve a similar benefit: articles that are well-structured, comprehensive, and deeply connected across diverse topics.</p>
<h1 id="heading-solution-design">Solution Design</h1>
<h2 id="heading-requirements">Requirements</h2>
<ul>
<li><p><strong>Core content generation capabilities</strong></p>
<ul>
<li><p>Generate structured article skeletons from the Zettelkasten graph, identifying key sections and points based on the strongest links between notes</p>
</li>
<li><p>Continuously reassess and update suggested outlines as new content is ingested, ensuring the article ideas evolve with the knowledge base</p>
</li>
<li><p>Identify gaps in content completeness and flag areas that require additional research before articles can be developed</p>
</li>
</ul>
</li>
<li><p><strong>Content management and tracking</strong></p>
<ul>
<li><p>Maintain a comprehensive catalogue of all suggested articles, with clear indicators of their progress stages (idea, research in progress, writing, complete)</p>
</li>
<li><p>Track published articles and prevent duplicate or redundant suggestions</p>
</li>
<li><p>Preserve historical versions of article outlines to monitor the evolution of ideas over time</p>
</li>
</ul>
</li>
<li><p><strong>Graph-based analysis and prioritization</strong></p>
<ul>
<li><p>Model the Zettelkasten as a graph with nodes (notes) and edges (links) to uncover clusters of related information suitable for article development</p>
</li>
<li><p>Rank article suggestions by relevance, completeness, novelty, and expected value to readers</p>
</li>
<li><p>Detect emerging thematic trends and highlight cross-domain connections that present unique insights</p>
</li>
<li><p>Recommend priorities for article creation based on timeliness, novelty, and alignment with ongoing projects</p>
</li>
</ul>
</li>
<li><p><strong>System integration</strong></p>
<ul>
<li><p>Support ingestion from existing Zettelkasten formats such as Markdown files or Obsidian vaults, with bidirectional synchronization to keep the graph representation current</p>
</li>
<li><p>Integrate with a large language model to generate article outlines and summaries, allowing customization of prompts to maintain consistent writing style and tone</p>
</li>
</ul>
</li>
<li><p><strong>User interface and workflow management</strong></p>
<ul>
<li><p>Provide an intuitive UI dashboard where users can:</p>
<ul>
<li><p>View, search, and filter suggested article ideas by topic, status, or required research effort</p>
</li>
<li><p>Track progress of articles through various stages from ideation to completion</p>
</li>
<li><p>Review, approve, modify, or reject suggested article outlines before moving to writing</p>
</li>
<li><p>Access historical versions of outlines and compare their evolution</p>
</li>
</ul>
</li>
<li><p>Enable configurable outline detail levels within the UI, allowing users to control the granularity of article skeletons</p>
</li>
<li><p>Offer export options for article outlines in Markdown or other preferred formats, including links back to the original notes in the Zettelkasten</p>
</li>
</ul>
</li>
</ul>
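<p>As an illustrative sketch of the graph model in the requirements above, a Zettelkasten stored as Markdown with Obsidian-style wiki-links can be turned into an adjacency structure; the regex and note contents are assumptions for illustration, not a final design:</p>

```python
import re
from collections import defaultdict

# Obsidian-style [[wiki-link]] targets, stopping at ']', '|' (alias) or '#' (heading)
WIKILINK = re.compile(r"\[\[([^\]|#]+)")

def build_graph(notes: dict) -> dict:
    """notes maps note title -> Markdown body; returns adjacency sets."""
    graph = defaultdict(set)
    for title, body in notes.items():
        graph[title]  # ensure isolated notes still appear as nodes
        for target in WIKILINK.findall(body):
            graph[title].add(target.strip())
    return dict(graph)

notes = {
    "GraphRAG": "Builds on [[Retrieval Augmented Generation]] and [[Knowledge Graphs]].",
    "Knowledge Graphs": "Nodes and edges; see [[GraphRAG]].",
}
print(sorted(build_graph(notes)["GraphRAG"]))
```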
<h1 id="heading-implementation">Implementation</h1>
<h2 id="heading-ingestion-pipeline">Ingestion Pipeline</h2>
<h2 id="heading-content-completeness-evaluation-pipeline">Content Completeness Evaluation Pipeline</h2>
<h2 id="heading-article-draft-generation-pipeline">Article Draft Generation Pipeline</h2>
<h2 id="heading-deployment">Deployment</h2>
<h1 id="heading-evaluation">Evaluation</h1>
]]></content:encoded></item><item><title><![CDATA[How Does Search Work?]]></title><description><![CDATA[Crawling/Ingestion
Indexing
The fundamental data structure involved in search is the inverted index.
Retrieval
Keyword Search
Vector Search
There are 2 common approaches for performing vector search in a vector store. Hierarchical Navigable Small Wor...]]></description><link>https://blog.vishnusujeesh.com/how-does-search-work</link><guid isPermaLink="true">https://blog.vishnusujeesh.com/how-does-search-work</guid><category><![CDATA[search]]></category><category><![CDATA[Search Engines]]></category><category><![CDATA[Retrieval]]></category><dc:creator><![CDATA[Vishnu S]]></dc:creator><pubDate>Fri, 08 Aug 2025 15:04:23 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/afW1hht0NSs/upload/27f5c0c6de6581c6e0a45b795195dc1c.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1 id="heading-crawlingingestion">Crawling/Ingestion</h1>
<h1 id="heading-indexing">Indexing</h1>
<p>The fundamental data structure involved in search is the inverted index.</p>
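<p>A toy inverted index, mapping each term to the set of documents that contain it:</p>

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

docs = ["the quick brown fox", "the lazy dog", "quick brown dogs"]
index = build_inverted_index(docs)
print(sorted(index["quick"]))  # [0, 2]
print(sorted(index["the"]))    # [0, 1]
```

A real engine would also tokenize and stem terms and store positions for phrase queries, but the core lookup is this term-to-postings map.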
<h1 id="heading-retrieval">Retrieval</h1>
<h2 id="heading-keyword-search">Keyword Search</h2>
<h2 id="heading-vector-search">Vector Search</h2>
<p>There are two common approaches for performing vector search in a vector store: Hierarchical Navigable Small World (HNSW) and Inverted File Index (IVF). These approaches allow quick approximate nearest-neighbour search, identifying vectors that are similar based on a measure such as cosine similarity or Euclidean distance.</p>
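<p>For intuition, here is exact brute-force nearest-neighbour search with cosine similarity; HNSW and IVF exist precisely to approximate this result without scanning every vector. The vectors are illustrative:</p>

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def nearest(query, vectors, k=2):
    """Exact (brute-force) k-nearest-neighbour search by cosine similarity."""
    ranked = sorted(range(len(vectors)),
                    key=lambda i: cosine_similarity(query, vectors[i]),
                    reverse=True)
    return ranked[:k]

vectors = [[1, 0], [0, 1], [0.9, 0.1]]
print(nearest([1, 0], vectors, k=2))  # [0, 2]
```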
<h2 id="heading-re-ranking-and-rank-fusion">Re-ranking and Rank Fusion</h2>
<h1 id="heading-personalisation">Personalisation</h1>
<h2 id="heading-collaborative-filtering">Collaborative Filtering</h2>
<p>Collaborative filtering works on the principle of finding users whose preferences resemble the target user's, and recommending the content those similar users liked.</p>
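<p>A toy sketch of user-based collaborative filtering; the similarity measure (count of matching ratings) and the ratings data are illustrative simplifications:</p>

```python
def recommend(target, ratings, k=1):
    """Find the k users most similar to the target, then suggest items
    they rated highly that the target has not seen.
    ratings maps user -> {item: score}."""
    def similarity(u, v):
        common = set(ratings[u]) & set(ratings[v])
        # Toy similarity: number of co-rated items with identical scores
        return sum(1 for i in common if ratings[u][i] == ratings[v][i])

    peers = sorted((u for u in ratings if u != target),
                   key=lambda u: similarity(target, u), reverse=True)
    seen = set(ratings[target])
    recs = []
    for peer in peers[:k]:
        recs.extend(item for item, score in ratings[peer].items()
                    if item not in seen and score >= 4)
    return recs

ratings = {
    "alice": {"a": 5, "b": 5},
    "bob":   {"a": 5, "b": 5, "c": 4},
    "carol": {"a": 1, "d": 5},
}
print(recommend("alice", ratings))  # ['c'] — bob agrees with alice and liked c
```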
<h2 id="heading-content-based-filtering">Content-based Filtering</h2>
<p>Content-based filtering, by contrast, recommends content whose attributes resemble those of items the user has already engaged with, rather than relying on the behaviour of other users.</p>
<h1 id="heading-evaluation">Evaluation</h1>
<p>Search is generally evaluated along two dimensions: recall, which is whether the right content is returned in the top K results, and relevance, which concerns the position of the right content within those top K results.</p>
<p>Recall can be measured via a simple ratio:</p>
<p>$$\text{Recall@K} = \frac{\sum \text{Queries with a relevant result in the top } K}{\sum \text{Queries made}} \times 100\%$$</p><p>Relevance can be measured with Mean Average Precision (MAP), which looks at the positions of relevant results in the returned top K results and averages precision at each of those positions. This penalises rankings in which irrelevant results appear ahead of relevant ones.</p>
<p>A better single metric is Normalized Discounted Cumulative Gain (NDCG), which accounts for both recall and relevance, and more heavily penalises relevant results that are not ranked at the top.</p>
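<p>A minimal NDCG computation, where each list entry is the graded relevance of the result at that rank:</p>

```python
import math

def dcg(relevances):
    """Discounted cumulative gain: relevance discounted by log2 of rank."""
    return sum(rel / math.log2(rank + 2)
               for rank, rel in enumerate(relevances))

def ndcg(relevances):
    """NDCG: DCG normalized by the DCG of the ideal (sorted) ranking."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

print(ndcg([3, 2, 1]))            # 1.0 — already ideally ordered
print(round(ndcg([1, 2, 3]), 3))  # below 1.0 — best result ranked last
```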
<h1 id="heading-references">References</h1>
<ol>
<li>How search works by Google</li>
</ol>
]]></content:encoded></item></channel></rss>