
Reading A Few Lines As Numpy Array From A Huge File

I have a text file that contains a billion words and their corresponding 300-dimensional word vectors. I need to extract a few thousand words and their word vectors from the file.

Solution 1:

I don't know if this is the fastest approach (probably not), but it works reasonably well (I tested it on a file with more than 100,000 lines):

import numpy as np

# Keep only the lines whose first token is one of the wanted words.
F = filter(lambda s: s.strip().split()[0] in word_set if s.strip() else False,
           open(fn, 'rt'))
x = np.genfromtxt(F, *yourargs, **yourkwds)

This is for Python 2. In Python 3 it seems one has to .encode() the input.
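A minimal, self-contained sketch of the same idea for a modern Python 3 / NumPy setup (the sample lines and word_set here are hypothetical stand-ins for the huge embeddings file; usecols assumes 3-dimensional vectors for brevity):

```python
import numpy as np

# Hypothetical sample standing in for the huge "word v1 v2 v3 ..." file.
lines = [
    "the 0.1 0.2 0.3",
    "cat 0.4 0.5 0.6",
    "dog 0.7 0.8 0.9",
]
word_set = {"cat", "dog"}

# Filter lines by their first token, then let genfromtxt parse only the
# numeric columns (column 0 is the word itself, so it is skipped).
wanted = (s for s in lines if s.strip() and s.strip().split()[0] in word_set)
x = np.genfromtxt(wanted, usecols=range(1, 4), dtype=float)

print(x.shape)  # one row per matched word, one column per dimension
```

With a real file, the generator would iterate over `open(fn, 'rt')` instead of the in-memory list, so only matching lines are ever handed to genfromtxt.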
