Hex Docs
Managing Memory in a Hex Project
Users have a finite amount of memory available in their Hex Projects. On rare occasions, you may find that you are running out of memory in your environment.
These errors commonly come in the form of Python MemoryErrors. The MemoryError messages shown in the example below are good indicators that the code in this cell caused the kernel to run out of memory.
Although there is an upper bound on how much memory you will be able to use in any project, there are a few strategies described below to help you reduce the amount of memory your project needs.

Delete Unnecessary Variables

Every variable created in a Hex project is stored in memory until it is deleted. When you no longer need a variable in a project, you can save memory by deleting the variable in a code cell. This example deletes a variable named 'example'.
del example
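Deleting a name frees the underlying memory only once no other references to the object remain. As a minimal sketch (the variable names are hypothetical, and the call to gc.collect() is optional rather than something Hex requires), the following removes a large list and confirms the name is gone:

```python
import gc

# Hypothetical large object, used only for illustration
example = list(range(1_000_000))
alias = example  # a second reference keeps the list alive

del example  # removes the name, but the list is still reachable via `alias`
del alias    # the last reference is gone, so the memory can be reclaimed

# Optional: ask the garbage collector to clean up reference cycles as well
gc.collect()

example_deleted = 'example' not in globals()
print(example_deleted)
```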
The variables that take up the most memory are typically the DataFrame outputs of SQL cells.
To identify the variables taking up the most memory, you can use this Python code:
import sys
import pandas as pd

# Names that IPython defines itself, plus the helper list created here
ipython_vars = ['In', 'Out', 'exit', 'quit', 'get_ipython', 'ipython_vars']

# Format a size in bytes as a human-readable string (decimal units)
def sizeof_fmt(num, suffix='B'):
    for unit in ['', 'K', 'M', 'G']:
        if abs(num) < 1000.0:
            return "%3.1f %s%s" % (num, unit, suffix)
        num /= 1000.0
    # Anything that reaches this point is in terabytes
    return "%.1f %s%s" % (num, 'T', suffix)

# Build (name, readable size, size in bytes) for each user-defined variable,
# sorted by the numeric size in bytes, largest first
variables = sorted(
    [
        (x, sizeof_fmt(sys.getsizeof(globals().get(x))), sys.getsizeof(globals().get(x)))
        for x in dir()
        if not x.startswith('_') and x not in sys.modules and x not in ipython_vars
    ],
    key=lambda x: x[2],
    reverse=True,
)

variables_df = pd.DataFrame(variables, columns=['ITEM', 'SIZE', 'SIZE_IN_BYTES'])

variables_df
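sys.getsizeof reports a single number per object. For a closer look at one DataFrame, pandas provides DataFrame.memory_usage. The sketch below uses a small made-up frame (the name sample_df and its columns are illustrative only) and passes deep=True so that the strings held in object columns are counted as well.

```python
import pandas as pd

# Illustrative data; in a project this would be an existing DataFrame
sample_df = pd.DataFrame({
    'id': range(1000),
    'city': ['Springfield', 'Shelbyville'] * 500,
})

# Bytes used by each column and by the index; deep=True inspects object values
per_column = sample_df.memory_usage(deep=True)
total_bytes = int(per_column.sum())

print(per_column)
print(total_bytes)
```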

Save Data to Files

If you have a variable you want to delete but will need to reference it later, you can save it as a file in your project. Any files you write to the working directory in your environment will be saved as part of your project. This allows you to store your data without having to keep it in memory.
For dataframes, a common way to do this is to write them as a CSV in your Python code. This example writes a dataframe to a file called "saved_df.csv" in the project.
df.to_csv('saved_df.csv')
You can use read_csv from pandas to read the saved CSV back into memory later in your project.
import pandas as pd
df = pd.read_csv('saved_df.csv', header=0, index_col=0)
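Put together, the pattern is to write the DataFrame out, delete it, and load it again only when it is needed. A rough sketch with a made-up DataFrame follows. It writes to a temporary directory so that it can run anywhere, whereas in a project you would write to the working directory as described above.

```python
import os
import tempfile

import pandas as pd

# Made-up data standing in for a large DataFrame
df = pd.DataFrame({'a': [1, 2, 3], 'b': [0.5, 1.5, 2.5]})

# A temporary directory keeps this sketch self-contained
path = os.path.join(tempfile.mkdtemp(), 'saved_df.csv')

df.to_csv(path)  # the index is written as the first column
del df           # drop the in-memory copy

# Later: read the file back, restoring the index from the first column
restored = pd.read_csv(path, header=0, index_col=0)
print(restored)
```

CSV files do not record column data types, so pandas infers them again on read. Columns that were converted to category or float32 before saving need to be converted again after loading.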

Modify Data Types

You may be able to shrink the memory usage of a dataframe by changing the data type of some of the columns. You can check the data types in a code cell like this:
variables_df.dtypes
Based on these types there are a few common conversions you can make to save memory.
    object -> category
      Convert object columns to category columns if there are relatively few unique values in this column compared to the number of rows.
    float64 -> float32
      Convert float64 columns to float32 columns unless you need more than about 7 significant digits of precision.
    int64 -> int32
      Convert int64 columns to int32 columns unless your data falls outside the range -2147483648 to 2147483647 (inclusive).
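Before converting, it can help to check that a column actually meets these conditions. The helpers below are a hypothetical sketch: the function names and the 50% uniqueness threshold are assumptions made for illustration, not a pandas or Hex rule.

```python
import numpy as np
import pandas as pd

def fits_int32(series):
    """Return True if every value lies within the int32 range."""
    limits = np.iinfo(np.int32)
    return bool(series.min() >= limits.min and series.max() <= limits.max)

def is_low_cardinality(series, max_ratio=0.5):
    """Return True if unique values are few relative to the row count.

    max_ratio is an arbitrary threshold chosen for illustration.
    """
    return series.nunique() / len(series) <= max_ratio

# Made-up columns for illustration
counts = pd.Series([1, 250, 40_000], dtype='int64')
big = pd.Series([1, 3_000_000_000], dtype='int64')
labels = pd.Series(['red', 'blue', 'red', 'red', 'blue', 'red'])

small_ok = fits_int32(counts)           # all values fit in int32
big_ok = fits_int32(big)                # 3,000,000,000 exceeds the int32 maximum
labels_ok = is_low_cardinality(labels)  # 2 unique values across 6 rows
print(small_ok, big_ok, labels_ok)
```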
You can convert data types in a code cell. This example converts a column called "example_column" to the category data type.
df['example_column'] = df['example_column'].astype('category')
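To see what a conversion saves, compare memory usage before and after. This sketch builds a made-up DataFrame (the column names and values are illustrative) and applies all three conversions. The exact savings depend on your data.

```python
import numpy as np
import pandas as pd

# Made-up data: a repetitive text column, a float column, and an integer column
n = 10_000
df = pd.DataFrame({
    'example_column': ['alpha', 'beta', 'gamma', 'delta'] * (n // 4),
    'price': np.linspace(0.0, 100.0, n),
    'quantity': np.arange(n, dtype='int64'),
})

before = int(df.memory_usage(deep=True).sum())

df['example_column'] = df['example_column'].astype('category')
df['price'] = df['price'].astype('float32')      # keeps about 7 significant digits
df['quantity'] = df['quantity'].astype('int32')  # values must fit in the int32 range

after = int(df.memory_usage(deep=True).sum())

print(before, after)
```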