Torturing my supercomputer: an illustration that a GPU is good for more than machine learning and heavy math.
My script takes a thick English dictionary (Webster) and repeats it 30 times, producing a list of 12 million words. It then goes through all 12 million words and replaces every vowel with an asterisk using a regex. To add more load, it adds a “word length” column, keeps only the entries longer than 10 characters, and finds the 5 most frequent masked patterns.
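To make the steps concrete, here is a minimal sketch of the same pipeline using only the Python standard library. The `sample_entries` list is a hypothetical stand-in for the dictionary (the real script loads Webster), and the variable names are mine, not the script's.

```python
import re
from collections import Counter

# Hypothetical stand-in for the dictionary entries (assumption: the real
# script loads the full Webster dictionary instead).
sample_entries = ["sir w. scott.", "sir w. scott.", "blackstone.", "cat", "encyclopedia"]

# "Multiply by 30": repeat the list so the workload is 30 times larger.
entries = sample_entries * 30

# Matches lowercase vowels only, the same character class as in the post.
vowels = re.compile(r"[aeiou]")

# Step 1: replace every vowel with an asterisk.
masked = [vowels.sub("*", entry) for entry in entries]

# Steps 2 and 3: keep masked entries longer than 10 characters and count them.
counts = Counter(m for m in masked if len(m) > 10)

# Step 4: the 5 most frequent masked patterns.
top5 = counts.most_common(5)
print(top5)
```

Note that the length check counts every character, including spaces and periods, which is why short-looking entries such as "bl*ckst*n*." pass the filter.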
In Python (pandas), the whole thing is three lines:
df['masked'] = df['text'].str.replace(r'[aeiou]', '*', regex=True)
df['len'] = df['masked'].str.len()
res = df[df['len'] > 10]['masked'].value_counts().head(5)
This code is run first on the CPU with pandas, then on the GPU with cuDF.
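The post does not show how the two runs are timed, so the following is only a sketch of one common approach: wrap the workload in a function and measure it with `time.perf_counter`. The helper name `time_call`, the `repeats` parameter, and the placeholder workload are all assumptions for illustration.

```python
import time

def time_call(label, fn, repeats=3):
    """Run fn() `repeats` times; return (best_seconds, last_result)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn()
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    # Same label style as the log lines in the post.
    print(f"[{label}] Computation Time: {best:.4f} sec.")
    return best, result

# Placeholder workload (assumption); the real benchmark would pass a function
# that runs the three DataFrame lines on a pandas or cuDF DataFrame.
demo_time, demo_result = time_call("demo", lambda: sum(range(1_000_000)))
```

Taking the best of several runs is a design choice, not something the post confirms; it helps keep one-time startup costs out of the reported number.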
The CPU (I have the top-tier Intel Core Ultra 9 285K) completes this task in about 23.6 seconds, while the Nvidia RTX 5090 does it in 0.51 seconds, not counting the 1.16 seconds spent copying the data to the GPU. That’s a 46x speedup!
[Pandas CPU] Top Patterns:
masked
s*r w. sc*tt. 23280
s*r t. br*wn*. 23220
j*r. t*yl*r. 16140
bl*ckst*n*. 10860
b***. & fl. 10830
Name: count, dtype: int64
[Pandas CPU] Computation Time: 23.5596 sec.
Transferring data to GPU…
Transfer complete in 1.16s
--- Running Benchmark: cuDF GPU ---
[cuDF GPU] Top Patterns:
masked
s*r w. sc*tt. 23280
s*r t. br*wn*. 23220
j*r. t*yl*r. 16140
bl*ckst*n*. 10860
b***. & fl. 10830
Name: count, dtype: int64
[cuDF GPU] Computation Time: 0.5108 sec.
TOTAL SPEEDUP: 46.12x
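The reported speedup compares computation time only. As a quick check of the arithmetic, the sketch below recomputes it from the timings in the log and also shows what the ratio looks like if the one-time transfer to the GPU is counted. The variable names are mine, and the transfer time is only given to two decimals in the log, so the second figure is approximate.

```python
# Timings copied from the log above.
cpu_seconds = 23.5596
gpu_seconds = 0.5108
transfer_seconds = 1.16  # host-to-GPU copy, reported separately in the log

# Ratio of computation times only (this is what the log reports).
compute_speedup = cpu_seconds / gpu_seconds

# Ratio when the transfer is added to the GPU side.
end_to_end_speedup = cpu_seconds / (gpu_seconds + transfer_seconds)

print(f"compute only:  {compute_speedup:.2f}x")     # 46.12x
print(f"with transfer: {end_to_end_speedup:.2f}x")  # 14.10x
```

Even with the transfer included, the GPU comes out roughly 14 times faster on this workload, and the copy only has to be paid once if several operations run on the same data.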

