Today we are looking at how you can work with big integers in Python. What do I mean by “big” integers?

A standard 32-bit integer tops out at 2,147,483,647. A standard 64-bit integer tops out at about 9.2 × 10^18.

The very interesting thing about Python is: it doesn’t have a fixed ceiling for integers!

The only limitation is the machine’s memory. So you can effectively store any integer in a Python int.

>>> 2**64
18446744073709551616
>>> 2**1000
107150860718626732094842504906000181056140481170553360744375038837035105112493612249319837881569585812...

Yes, that’s crazy. Even your phone number can be stored in a Python int without any explicit long long type.

But how?

We’ll look at CPython’s implementation and also implement something similar ourselves to see how it works. We will write a BigInt class and implement addition, subtraction, and multiplication. In CPython, division is more complex and deserves a separate writeup.

Big int in Python

In Python, an int is represented by a PyLongObject. Refer to the CPython source: Objects/longobject.c.

The value is stored in an array called the digit array.

Its structure is:

typedef struct _PyLongValue {
    uintptr_t lv_tag; /* Number of digits, sign, and flags */
    digit ob_digit[1]; /* The actual array of base-2^30 digits */
} _PyLongValue;

struct _longobject {
    PyObject_HEAD
    _PyLongValue long_value;
};

The digit ob_digit[1] looks like an array of size 1, but it is extended as needed.

Question

How do you store 1234567 in a base 1000 system?

Think about it.

Here’s how:

  • Take the number modulo the base (1000 for us).
  • Repeat until the remaining value is zero.

That gives us [567, 234, 1].

That is the representation we work with.

So how does Python handle big integers? It allocates memory for digits as needed and performs arithmetic on those digits. CPython also has smarter optimizations such as lv_tag, but those are out of scope for this article.

Flow

This is the flow of how Python does it:

  1. Break the integer into digits.
  2. Put the digits into the digits array.
  3. Store it that way and do arithmetic on those digits.
Big integer array representation

Code

I wrote a BigInt implementation myself. It handles addition, subtraction, and multiplication of big integer objects.

When looking at the code, imagine 1234567 as an array of digits. That will help you understand it better.

Here’s the GitHub repo: https://github.com/aayushyatiwari/blogCode/tree/master/bigInt

explaining code

I know people don’t go to the github and see the code so I’ll explain in brief how someone would go ahead and write their own BigInt engine.

This is how I thought about it.
What you want to be able to do is call

func main() {
	a := NewBigInt("99999999999999999999")
	b := NewBigInt("1")

	fmt.Println("a        =", a)
	fmt.Println("b        =", b)
	fmt.Println("a + b    =", Add(a, b))
	fmt.Println("a - b    =", Subtract(a, b))
	fmt.Println("a * b    =", Multiply(a, b))
}

The NewBigInt is the object that we want to create.

NewBigInt


// the BigInt struct similar to PyLongObject
type BigInt struct {
	sign   int // -1, 0, or 1
	digits []int
}

// NewBigInt parses a decimal string (e.g. "-12345") into a BigInt.
func NewBigInt(s string) BigInt {
	if len(s) == 0 {
		return BigInt{}
	}
	sign := 1
	if s[0] == '-' {
		sign = -1
		s = s[1:]
	}
	digits := make([]int, len(s))
	for i, ch := range s {
		digits[len(s)-1-i] = int(ch - '0')
	}
	// trim leading zeros
	for len(digits) > 1 && digits[len(digits)-1] == 0 {
		digits = digits[:len(digits)-1]
	}
	if len(digits) == 1 && digits[0] == 0 {
		return BigInt{sign: 0, digits: []int{0}}
	}
	return BigInt{sign: sign, digits: digits}
}

You also need a way to convert BigInt to string to print results.

// String converts a BigInt back to its decimal string representation.
func (b BigInt) String() string {
	if b.sign == 0 || len(b.digits) == 0 {
		return "0"
	}
	out := make([]byte, len(b.digits))
	for i, d := range b.digits {
		out[len(b.digits)-1-i] = byte('0' + d)
	}
	if b.sign == -1 {
		return "-" + string(out)
	}
	return string(out)
}

Operations

While doing addition, subtraction and multiplication you should think about how you’d do it in a standard array.
There are algorithms that are optimized like Karatsuba algorithm (article on it one day) that make the approaches more efficient.

Closing thoughts

It’s interesting that it comes down to the question: did the language designers decide to do the extra work? When I compared Python with C++, I used to think this behavior was some kind of grace.

It’s not grace.

It’s people deciding that programmers should focus on bigger ideas rather than on allocating memory for large integers.

So fascinating!

~ Aayushya