Introduction to Hashing
- Hashing is used in data compression, search engines, password protection, fraud detection, and securing websites.
- It is also central to technologies such as blockchain, image processing, and load balancing.
Functions and Hash Functions
- Functions: Map each input to exactly one output. A function can be one-to-one or many-to-one; one-to-many and many-to-many mappings are relations, not functions.
- Pure Functions: Always produce the same output for a given input, without side effects.
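- Impure Functions: May produce different outputs for the same input, or cause side effects, because they depend on external state (for example, the current time or a random number generator).
- Hashing: Converts input data of any size to a fixed-length output (the hash, also called a digest) using a hash function.
- Hash Functions: Deterministic (the same input always yields the same hash) and produce fixed-length outputs regardless of input size.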
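
The properties listed above can be sketched in a few lines of Python. This is only an illustration, not part of the original notes: the helper names `square`, `noisy_square`, and `sha1_hex` are made up for this example, and SHA-1 from the standard `hashlib` module is used simply because it is the algorithm discussed later.

```python
import hashlib
import random


def square(x: int) -> int:
    """Pure: the result depends only on the argument."""
    return x * x


def noisy_square(x: int) -> int:
    """Impure: the result also depends on hidden random state."""
    return x * x + random.randint(0, 9)


def sha1_hex(data: bytes) -> str:
    """Hash bytes of any length to a 40-character hex digest (160 bits)."""
    return hashlib.sha1(data).hexdigest()


if __name__ == "__main__":
    # Same input, same output, every time.
    print(square(4), square(4))
    # Same input, but the output may differ between calls.
    print(noisy_square(4), noisy_square(4))
    # Deterministic: hashing the same bytes twice gives the same digest.
    print(sha1_hex(b"abc") == sha1_hex(b"abc"))
    # Fixed length: an empty input and a 10,000-byte input both give 40 hex characters.
    print(len(sha1_hex(b"")), len(sha1_hex(b"x" * 10_000)))
```

A hash function behaves like a pure function in this sense: nothing but the input bytes influences the digest.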
Collisions in Hashing
- Collisions: Occur when different inputs produce the same hash output. Hashing algorithms aim to minimize collisions, but collisions are mathematically inevitable because the number of possible hash outputs is finite while the number of potential inputs is unlimited.
- Pigeonhole Principle: Explains why collisions are unavoidable. If you have more items than containers, at least one container must hold more than one item. Similarly, with more possible inputs than unique hash outputs, some inputs will inevitably share the same hash.
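- Collision Probability: The chance of two different inputs accidentally producing the same SHA-1 hash is extraordinarily low, because SHA-1 has 2^160 possible outputs. For practical purposes, an accidental collision is comparable to winning a lottery several times in a row. Deliberately constructed SHA-1 collisions have been demonstrated, however, so this estimate applies to accidental collisions only.
- SHA-1: A widely used hashing algorithm that produces a 160-bit hash, usually written as 40 hexadecimal characters. Git uses it by default to name the objects it stores.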
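
To make the pigeonhole argument and the collision probability concrete, here is a minimal sketch. It is an illustration under stated assumptions, not how Git works: `tiny_hash` is a made-up toy hash that keeps only the first byte of SHA-1 (256 possible outputs), and `collision_probability` uses the standard birthday approximation for an idealized hash with uniformly distributed outputs.

```python
import hashlib
import math


def tiny_hash(data: bytes) -> int:
    """Toy 8-bit hash: the first byte of SHA-1, so only 256 outputs exist."""
    return hashlib.sha1(data).digest()[0]


def find_collision(max_inputs: int = 257):
    """Hash distinct inputs until two of them share a tiny_hash value.

    With 256 possible outputs, 257 distinct inputs guarantee a collision
    (pigeonhole principle); in practice one shows up much earlier.
    """
    seen = {}
    for i in range(max_inputs):
        data = str(i).encode()
        h = tiny_hash(data)
        if h in seen:
            return seen[h], data, h
        seen[h] = data
    return None


def collision_probability(n_items: int, bits: int) -> float:
    """Birthday approximation: 1 - exp(-n(n-1) / 2^(bits+1))."""
    exponent = -n_items * (n_items - 1) / 2 ** (bits + 1)
    return -math.expm1(exponent)


if __name__ == "__main__":
    first, second, value = find_collision()
    print(f"{first!r} and {second!r} both hash to {value}")
    # Chance of any accidental collision among a billion items at 160 bits.
    print(collision_probability(10**9, 160))
```

Shrinking the output to 8 bits makes collisions easy to find, while at 160 bits the same formula gives a vanishingly small probability even for a billion hashed items.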
Git Initialization
What is a Git Repository?
- A Git repository (repo) is a virtual storage for your project, saving versions of your work and tracking changes.