🧬 Gene Frequency Calculator

Calculate allele frequencies using Hardy-Weinberg principles. Analyze two-allele and three-allele systems, perform chi-square tests for HWE, compute carrier frequencies, and handle X-linked genes with step-by-step solutions.

Real-World Gene Frequency Examples

📊 Hardy-Weinberg — Cystic Fibrosis

Problem: In a population of 10,000 people, 1 in 2,500 has cystic fibrosis (an autosomal recessive disorder). What are the allele frequencies and carrier frequency?

Solution: q² = 1/2500 = 0.0004

q = √0.0004 = 0.02 (2% recessive allele frequency)

p = 1 − q = 0.98 (98% dominant allele frequency)

Carrier frequency (2pq) = 2 × 0.98 × 0.02 = 0.0392 ≈ 3.9%

This means about 392 out of 10,000 people are carriers (heterozygous) for cystic fibrosis, even though only 4 have the disease.
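
The steps above can be sketched in Python. This is a minimal illustration that assumes Hardy-Weinberg equilibrium and complete dominance; the function name `carrier_stats` and its parameters are hypothetical and are not part of the calculator itself.

```python
import math

def carrier_stats(incidence, population=None):
    """Estimate q, p and carrier frequency from recessive disease incidence (q^2).

    Assumes Hardy-Weinberg equilibrium; all names here are illustrative.
    """
    if not 0 <= incidence <= 1:
        raise ValueError("incidence must be between 0 and 1")
    q = math.sqrt(incidence)   # recessive allele frequency
    p = 1 - q                  # dominant allele frequency
    carriers = 2 * p * q       # heterozygote (carrier) frequency
    result = {"q": q, "p": p, "carrier_freq": carriers}
    if population is not None:
        # Scale frequencies to expected head counts in the population
        result["expected_carriers"] = carriers * population
        result["expected_affected"] = incidence * population
    return result

stats = carrier_stats(1 / 2500, population=10_000)
print(stats)
```

Passing the population size is optional; without it the function returns frequencies only.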

📊 Two-Allele Frequency from Genotype Counts

Problem: In a population of 100 individuals, there are 36 AA, 48 Aa, and 16 aa. Calculate allele frequencies and test for HWE.

Solution:

Total alleles = 2 × 100 = 200

Count of A = 2×36 + 48 = 120 → p = 120/200 = 0.60

Count of a = 2×16 + 48 = 80 → q = 80/200 = 0.40

Expected: AA = p² × 100 = 36, Aa = 2pq × 100 = 48, aa = q² × 100 = 16

χ² = (36−36)²/36 + (48−48)²/48 + (16−16)²/16 = 0.00 (Population in HWE)
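
The allele-counting and expected-count steps above could be written as follows. This is a sketch only; `allele_freqs` and `expected_counts` are hypothetical helper names chosen for this example.

```python
def allele_freqs(n_AA, n_Aa, n_aa):
    """Return (p, q) from two-allele genotype counts by direct allele counting."""
    n = n_AA + n_Aa + n_aa
    if n == 0:
        raise ValueError("at least one individual is required")
    p = (2 * n_AA + n_Aa) / (2 * n)   # each AA carries two A copies, each Aa one
    return p, 1 - p

def expected_counts(p, q, n):
    """Expected AA, Aa and aa counts under HWE for n individuals."""
    return p * p * n, 2 * p * q * n, q * q * n

observed = (36, 48, 16)
p, q = allele_freqs(*observed)
expected = expected_counts(p, q, sum(observed))
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(p, q, expected, chi2)
```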

🧬 X-Linked Recessive — Color Blindness

Problem: Red-green color blindness is an X-linked recessive trait. In a population, 8 out of 50 males are color blind. Estimate the frequency of the recessive allele.

Solution: In males, the phenotype directly reflects the genotype (hemizygous).

q (a) = 8/50 = 0.16 (16%)

p (A) = 1 − 0.16 = 0.84 (84%)

Expected female carrier frequency (2pq) = 2 × 0.84 × 0.16 = 26.9%

For X-linked traits, allele frequency in males equals phenotype frequency because males have only one X chromosome.
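
A short Python sketch of the same estimate is shown here. It assumes the female genotype frequencies follow HWE; the function name `x_linked_from_males` is an illustrative choice, not something defined by the calculator.

```python
def x_linked_from_males(affected_males, total_males):
    """Estimate an X-linked recessive allele frequency from male phenotype counts.

    Males are hemizygous, so q equals the affected fraction of males.
    Female genotype frequencies are then predicted under HWE (an assumption).
    """
    if total_males <= 0 or not 0 <= affected_males <= total_males:
        raise ValueError("need total_males > 0 and 0 <= affected_males <= total_males")
    q = affected_males / total_males
    p = 1 - q
    return {
        "q": q,
        "p": p,
        "female_carriers": 2 * p * q,   # heterozygous females
        "female_affected": q * q,       # homozygous recessive females
    }

result = x_linked_from_males(8, 50)
print(result)
```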

📊 Chi-Square Test for HWE Deviation

Problem: In a population of 200 individuals, observed counts are: AA = 80, Aa = 40, aa = 80. Is the population in HWE?

Solution: p = (2×80 + 40) / 400 = 0.50, q = 0.50

Expected: AA = 0.25 × 200 = 50, Aa = 0.50 × 200 = 100, aa = 0.25 × 200 = 50

χ² = (80−50)²/50 + (40−100)²/100 + (80−50)²/50 = 18 + 36 + 18 = 72 (p < 0.001)

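With χ² = 72 and df = 1, p < 0.001, so the deviation from HWE is significant. Heterozygotes are far rarer than expected (40 observed vs. 100 expected), a pattern consistent with inbreeding, assortative mating, population substructure, or selection against heterozygotes.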
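
The test above, including the p-value, can be sketched with the Python standard library alone. For df = 1 the chi-square upper-tail probability equals erfc(√(χ²/2)), so no statistics package is needed. The function name `hwe_chi_square` is hypothetical, and this shortcut applies only to the two-allele case.

```python
import math

def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square goodness-of-fit test for HWE in a two-allele system (df = 1)."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1 - p
    if p in (0.0, 1.0):
        # A fixed allele gives zero expected counts, so the test is undefined
        raise ValueError("both alleles must be present in the sample")
    observed = (n_AA, n_Aa, n_aa)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Closed-form upper-tail probability, valid for df = 1 only
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

chi2, p_value = hwe_chi_square(80, 40, 80)
print(chi2, p_value)
```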

📊 Three-Allele System — ABO Blood Types

Problem: In a population genotyped at a three-allele locus (alleles 1, 2, and 3, analogous to the three ABO alleles), the observed genotype counts are: 11=25, 12=30, 13=15, 22=10, 23=12, 33=8. Find the allele frequencies.

Solution: Total individuals = 25+30+15+10+12+8 = 100

p (allele 1) = (2×25 + 30 + 15) / 200 = 95/200 = 0.475

q (allele 2) = (2×10 + 30 + 12) / 200 = 62/200 = 0.310

r (allele 3) = (2×8 + 15 + 12) / 200 = 43/200 = 0.215

Sum check: 0.475 + 0.310 + 0.215 = 1.000 ✓
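
The same allele counting generalizes to any number of alleles. The sketch below stores genotype counts in a dictionary keyed by allele pairs; that data layout and the function name `allele_freqs_from_genotypes` are assumptions made for this example.

```python
from collections import Counter

def allele_freqs_from_genotypes(genotype_counts):
    """Count allele copies from a {(allele, allele): count} mapping."""
    copies = Counter()
    for (a1, a2), count in genotype_counts.items():
        # A homozygote such as (1, 1) adds its count to the same allele twice
        copies[a1] += count
        copies[a2] += count
    total = sum(copies.values())   # equals 2 x number of individuals
    return {allele: n / total for allele, n in copies.items()}

counts = {(1, 1): 25, (1, 2): 30, (1, 3): 15, (2, 2): 10, (2, 3): 12, (3, 3): 8}
freqs = allele_freqs_from_genotypes(counts)
print(freqs)
```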

Gene Frequency Formulas & Guide

Hardy-Weinberg Equilibrium

p + q = 1   and   p² + 2pq + q² = 1
Allele frequency and genotype frequency equations for a two-allele system

Where p is the frequency of the dominant allele (A), q is the frequency of the recessive allele (a), p² is the frequency of homozygous dominant (AA), 2pq is the frequency of heterozygous (Aa), and q² is the frequency of homozygous recessive (aa).

Chi-Square Test for HWE

χ² = Σ (O − E)² / E
Degrees of freedom = number of genotypes − number of alleles

The chi-square test compares observed genotype counts (O) with expected counts under HWE (E). For a two-allele system with three genotypes, df = 1. A significant χ² (p < 0.05) indicates the population deviates from Hardy-Weinberg equilibrium, suggesting evolutionary forces at work.
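
As a rough sketch, the degrees-of-freedom rule and the significance decision could be coded as below. The critical values are the standard chi-square table entries at α = 0.05; the table is deliberately short, and the function names are hypothetical.

```python
# Upper-tail critical values of the chi-square distribution at alpha = 0.05
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815}

def hwe_df(n_alleles):
    """Degrees of freedom = number of genotypes - number of alleles."""
    n_genotypes = n_alleles * (n_alleles + 1) // 2
    return n_genotypes - n_alleles

def deviates_from_hwe(chi2, n_alleles):
    """True if chi2 exceeds the 5% critical value for this system."""
    df = hwe_df(n_alleles)
    if df not in CRITICAL_05:
        raise ValueError(f"no critical value tabulated for df = {df}")
    return chi2 > CRITICAL_05[df]

print(hwe_df(2), hwe_df(3))
print(deviates_from_hwe(72.0, n_alleles=2))
```

Two alleles give 3 genotypes and df = 1; three alleles give 6 genotypes and df = 3.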

Three-Allele System

p + q + r = 1
With six genotypes: p², 2pq, 2pr, q², 2qr, r²

For a gene with three alleles, allele frequencies are calculated by counting each allele copy from the genotype counts: p = (2×n₁₁ + n₁₂ + n₁₃) / 2N, q = (2×n₂₂ + n₁₂ + n₂₃) / 2N, r = (2×n₃₃ + n₁₃ + n₂₃) / 2N, where N is the total number of individuals.
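
Once the allele frequencies are known, the six expected genotype frequencies follow from the products listed above. The sketch below is illustrative; `expected_genotype_freqs` is a hypothetical name, and the input frequencies are taken from the three-allele worked example.

```python
from itertools import combinations_with_replacement

def expected_genotype_freqs(allele_freqs):
    """Expected HWE genotype frequencies for any number of alleles."""
    expected = {}
    for a1, a2 in combinations_with_replacement(sorted(allele_freqs), 2):
        product = allele_freqs[a1] * allele_freqs[a2]
        # Homozygotes: p^2; heterozygotes: 2pq (two orders of the same pair)
        expected[(a1, a2)] = product if a1 == a2 else 2 * product
    return expected

freqs = {1: 0.475, 2: 0.310, 3: 0.215}
expected = expected_genotype_freqs(freqs)
for genotype, freq in expected.items():
    print(genotype, round(freq, 4))
```

Multiplying each expected frequency by N gives the expected counts needed for the chi-square test.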

Phenotype-Based Estimation

q = √(recessive phenotype frequency)
p = 1 − q, assuming HWE and complete dominance

When genotype counts are unknown but phenotype counts are available, the recessive allele frequency can be estimated as the square root of the recessive phenotype frequency. This method assumes Hardy-Weinberg equilibrium and complete dominance.
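
The sketch below applies the square-root method to two of the populations from the worked examples, to show how much the estimate depends on the HWE assumption. The function name `q_from_phenotypes` is hypothetical.

```python
import math

def q_from_phenotypes(n_dominant, n_recessive):
    """Estimate (p, q) from phenotype counts, assuming HWE and complete dominance."""
    n = n_dominant + n_recessive
    if n <= 0:
        raise ValueError("at least one individual is required")
    q = math.sqrt(n_recessive / n)
    return 1 - q, q

# Population in HWE (36 AA, 48 Aa, 16 aa): allele counting gives q = 0.40
p_hwe, q_hwe = q_from_phenotypes(36 + 48, 16)

# Population not in HWE (80 AA, 40 Aa, 80 aa): allele counting gives q = 0.50
p_off, q_off = q_from_phenotypes(80 + 40, 80)

print(round(q_hwe, 3), round(q_off, 3))
```

In the first population the estimate matches the value from allele counting; in the second it comes out near 0.63 instead of 0.50, because the population is not in HWE.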

X-Linked Genes

Males: p = freq(XᴬY), q = freq(XᵃY)
Females: p = (2×AA + Aa) / 2N, q = (2×aa + Aa) / 2N

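For X-linked genes, males are hemizygous, so their allele frequency equals their phenotype frequency directly. Female allele frequencies are calculated from genotype counts as in autosomal genes, with N the number of females. The combined population frequency weights each sex by the number of X chromosomes it contributes: with equal numbers of males and females, females carry two-thirds of the X chromosomes.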
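
One way to combine the two sexes is to pool all X chromosomes, so that each female counts twice and each male once. The sketch below takes that approach as an assumption; the function name and the sample counts are made up for illustration and do not come from the calculator.

```python
def x_linked_pooled_q(n_AA_f, n_Aa_f, n_aa_f, n_A_m, n_a_m):
    """Return (q_females, q_males, q_pooled) for an X-linked locus.

    The pooled value weights each sex by its number of X chromosomes.
    """
    female_x = 2 * (n_AA_f + n_Aa_f + n_aa_f)   # two X chromosomes per female
    male_x = n_A_m + n_a_m                      # one X chromosome per male
    if female_x == 0 or male_x == 0:
        raise ValueError("need at least one female and one male")
    a_in_females = 2 * n_aa_f + n_Aa_f
    q_females = a_in_females / female_x
    q_males = n_a_m / male_x
    q_pooled = (a_in_females + n_a_m) / (female_x + male_x)
    return q_females, q_males, q_pooled

# Hypothetical sample: 50 females (35 AA, 13 Aa, 2 aa) and 50 males (42 A, 8 a)
q_f, q_m, q_all = x_linked_pooled_q(35, 13, 2, 42, 8)
print(q_f, q_m, round(q_all, 4))
```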

Key Concepts

📌 Carrier Frequency

The carrier frequency (2pq) represents the proportion of heterozygous individuals in a population. For recessive disorders, carriers are unaffected but can pass the disease allele to their offspring. For rare recessive alleles, carrier frequency is much higher than disease frequency: the ratio 2pq/q² = 2p/q grows as q shrinks (about 98 for q = 0.02).
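
How far carrier frequency exceeds disease frequency depends on q, since 2pq/q² simplifies to 2p/q. A quick numerical check, using a hypothetical helper name and assuming HWE:

```python
def carrier_to_disease_ratio(q):
    """Ratio of carrier frequency (2pq) to disease frequency (q^2) under HWE."""
    if not 0 < q < 1:
        raise ValueError("q must be strictly between 0 and 1")
    p = 1 - q
    return (2 * p * q) / (q * q)   # algebraically equal to 2p / q

for q in (0.01, 0.02, 0.16, 0.5):
    print(q, round(carrier_to_disease_ratio(q), 1))
```

The ratio is large only when the recessive allele is rare; it falls to 2 at q = 0.5.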

📌 HWE Assumptions

Hardy-Weinberg equilibrium requires: (1) random mating, (2) no mutation, (3) no natural selection, (4) large population size (no genetic drift), and (5) no gene flow. Real populations rarely satisfy all conditions perfectly.

📌 Interpreting χ²

A small χ² (p > 0.05) means the data are consistent with HWE; it does not prove that the population is in equilibrium. A large χ² (p < 0.05) indicates significant deviation from HWE; possible causes include selection, inbreeding, population stratification, or genotyping errors.

📌 Heterozygote Advantage

Some genetic disorders persist at high frequencies because carriers have a selective advantage. Example: Sickle cell trait (HbAS) confers resistance to malaria, maintaining the HbS allele at higher frequencies in malaria-endemic regions.

How to Use the Gene Frequency Calculator

1. Choose input mode: Select from Two-Allele, Three-Allele, Phenotype-Based, X-Linked, or Carrier Frequency mode.
2. Enter observed data: Input genotype counts, phenotype counts, or disease frequency depending on the selected mode.
3. Calculate: Click the calculate button to compute allele frequencies, expected genotype frequencies, and the chi-square test for HWE.
4. Review results: Check allele frequencies (p, q, r), expected genotype frequencies, chi-square statistic, and the HWE conclusion.
5. Follow the steps: Review the step-by-step solution to understand how each value was calculated.

📊 Two-Allele Analysis

Calculate p and q allele frequencies from genotype counts. View expected Hardy-Weinberg genotype frequencies and chi-square test results.

📊 Three-Allele System

Extend Hardy-Weinberg analysis to three alleles (p, q, r). Compute all six genotype frequencies with allele frequency sum check.

📋 Phenotype-Based Mode

Estimate allele frequencies from dominant and recessive phenotype counts. Uses the square root method under HWE assumptions.

🧬 X-Linked & Carrier

Handle X-linked genes with separate male/female analysis. Compute carrier frequency from recessive disease frequency.

⚠️ Important Note: Hardy-Weinberg equilibrium assumes ideal conditions (random mating, no mutation, no selection, large population size, no gene flow). Real populations rarely satisfy all conditions perfectly. The chi-square test for HWE is sensitive to sample size — large sample sizes may detect statistically significant but biologically trivial deviations.

Frequently Asked Questions

What is the Hardy-Weinberg principle and why is it important?
The Hardy-Weinberg principle (HW principle) states that allele and genotype frequencies in a population remain constant from generation to generation in the absence of evolutionary influences. It provides a null model for population genetics — if observed genotype frequencies deviate significantly from HW expectations, it suggests that one or more evolutionary forces (selection, mutation, genetic drift, gene flow, non-random mating) are acting on the population. The principle is fundamental to population genetics, conservation biology, and medical genetics.
How do I calculate allele frequency from genotype counts?
For a gene with two alleles (A and a), calculate allele frequencies from genotype counts as follows: p = (2 × N_AA + N_Aa) / (2 × N) and q = (2 × N_aa + N_Aa) / (2 × N), where N_AA, N_Aa, and N_aa are the genotype counts and N is the total number of individuals. Each homozygous individual contributes 2 copies of their allele, while each heterozygous individual contributes 1 copy of each allele. Since p + q = 1, you can also calculate q = 1 − p after finding p.
What does a significant chi-square test mean for HWE?
A significant chi-square test (p < 0.05) indicates that the observed genotype frequencies deviate significantly from Hardy-Weinberg expectations. This suggests that one or more evolutionary forces are affecting the population. Possible causes include: (1) natural selection — certain genotypes have higher fitness, (2) non-random mating — individuals select mates based on genotype (assortative mating), (3) population stratification — the sample contains multiple subpopulations with different allele frequencies (Wahlund effect), (4) inbreeding — increased homozygosity, (5) genotyping errors — technical artifacts. A non-significant result (p > 0.05) means the data are consistent with HWE.
How is carrier frequency different from disease frequency?
Carrier frequency (2pq) is the proportion of heterozygous individuals who carry one copy of a recessive disease allele but do not show symptoms. Disease frequency (q²) is the proportion of homozygous recessive individuals who actually have the disease. For rare recessive disorders (small q), carrier frequency is much higher than disease frequency. For example, if disease frequency is 1/2500 (q² = 0.0004), then q = 0.02, p = 0.98, and carrier frequency = 2 × 0.98 × 0.02 ≈ 0.039 (about 1 in 26), which is 98 times higher than the disease frequency.
How does X-linked inheritance affect allele frequency calculations?
For X-linked genes, males have only one X chromosome (hemizygous), so their allele frequency equals their phenotype frequency directly. For example, if 8 out of 50 males show a recessive X-linked trait, q = 8/50 = 0.16 for males. Females, having two X chromosomes, follow the standard Hardy-Weinberg equations. This difference means allele frequencies can be estimated separately from males and females, and the combined population frequency depends on the sex ratio. For rare X-linked recessive disorders, the disease is much more common in males than females — if q = 0.16, the trait appears in 16% of males but only 0.16² = 2.56% of females.
What is the difference between count-based and frequency-based gene frequency calculations?
Count-based calculations use raw genotype counts (e.g., 36 AA, 48 Aa, 16 aa) to compute allele frequencies by direct allele counting. Frequency-based calculations start from observed genotype proportions (e.g., 0.36, 0.48, 0.16) and give the same allele frequencies, but a chi-square test also needs the sample size to convert expected frequencies into counts. Phenotype-based calculations estimate allele frequencies from phenotype counts (dominant vs. recessive), assuming HWE. This is less precise, because heterozygotes cannot be distinguished from dominant homozygotes, but it is useful when only phenotype data are available, such as in medical studies where genotyping is impractical.