Data section¶
Handling errors¶
lasio has a flexible way of handling “errors” in the ~ASCII data section, so you can choose how strict or lenient to be.
Example errors¶
Here are some examples of errors.
- Files could contain a variety of indicators for an invalid data point other than that defined by
the NULL line in the LAS header (usually -999.25).
- Fixed-width columns could run into each other:
7686.500 64.932 0.123 0.395 12.403 156.271 10.649 -0.005 193.223 327.902 -0.023 4.491 2.074 29.652
7686.000 67.354 0.140 0.415 9.207 4648.011 10.609 -0.004 3778.709 1893.751 -0.048 4.513 2.041 291.910
7685.500 69.004 0.151 0.412 7.020101130.188 10.560 -0.004 60000.000 2901.317 -0.047 4.492 2.046 310.119
7685.000 68.809 0.150 0.411 7.330109508.961 10.424 -0.005 60000.000 2846.619 -0.042 4.538 2.049 376.968
7684.500 68.633 0.149 0.402 7.345116238.453 10.515 -0.005 60000.000 2290.275 -0.051 4.543 2.063 404.972
7684.000 68.008 0.144 0.386 7.682 4182.679 10.515 -0.004 3085.681 1545.842 -0.046 4.484 2.089 438.195
- Odd text such as (null):
8090.00 -999.25 -999.25 -999.25 0 0 0 0 0 0 0 0
8091.000 0.70 337.70 (null) 0 0 0 0 0 0 0 0
8092.000 -999.25 -999.25 -999.25 0 0 0 0 0 0 0 0
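To see which of these problems a file contains before choosing a policy, you can scan the data lines for tokens that do not parse as numbers. The sketch below is not part of lasio: find_bad_tokens is a hypothetical helper, and it assumes whitespace-delimited columns.

```python
def find_bad_tokens(lines):
    """Yield (line_number, token) for each token that float() rejects.

    Hypothetical helper for illustration; not part of lasio.
    """
    for line_number, line in enumerate(lines, start=1):
        for token in line.split():
            try:
                float(token)
            except ValueError:
                yield line_number, token


data_lines = [
    "8090.00 -999.25 -999.25 -999.25 0 0 0 0 0 0 0 0",
    "8091.000 0.70 337.70 (null) 0 0 0 0 0 0 0 0",
    "7685.500 69.004 0.151 0.412 7.020101130.188 10.560",
]
bad_tokens = list(find_bad_tokens(data_lines))
```

Numeric placeholders such as -999.25 parse as valid floats, so a check like this only finds text markers and run-on values.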
Handling run-on errors¶
lasio detects and handles run-on values by default, which corresponds to lasio.read(f, read_policy='default').
For example, a file with this data section:
~A
7686.000 67.354 0.140 0.415 9.207 4648.011 10.609
7685.500 69.004 0.151 0.412 7.020101130.188 10.560
7685.000 68.809 0.150 0.411 7.330-19508.961 10.424
7684.500 68.633 0.149 0.402 7.345116238.453 10.515
7684.000 68.008 0.144 0.386 7.682 4182.679 10.515
is loaded by default as the following:
In [9]: las = lasio.read('tests/examples/null_policy_runon.las')
In [12]: las.data
Out[12]:
array([[7686.0, 67.354, 0.14, 0.415, 9.207, 4648.011, 10.609],
[7685.5, 69.004, 0.151, 0.412, nan, nan, 10.56],
[7685.0, 68.809, 0.15, 0.411, 7.33, -19508.961, 10.424],
[7684.5, 68.633, 0.149, 0.402, nan, nan, 10.515],
[7684.0, 68.008, 0.144, 0.386, 7.682, 4182.679, 10.515]])
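In the output above, values that ran together with no separator became a pair of NaNs, while the pair joined by a minus sign was split into two numbers. The sketch below shows one way to get that behaviour with regular expressions. The two patterns and the clean_run_ons helper are assumptions made for illustration and are not taken from lasio's source.

```python
import re

# Illustrative patterns (assumptions, not lasio's own definitions).
MINUS_RUN_ON = re.compile(r"(\d)-(\d)")      # e.g. "7.330-19508.961"
DOT_RUN_ON = re.compile(r"-?\d*\.\d*\.\d*")  # e.g. "7.020101130.188"


def clean_run_ons(line):
    """Return the floats on one data line, repairing run-on values."""
    # A minus sign directly between two digits marks the start of a new value.
    line = MINUS_RUN_ON.sub(r"\1 -\2", line)
    # Two decimal points in one token cannot be split reliably: use two NaNs.
    line = DOT_RUN_ON.sub(" NaN NaN ", line)
    return [float(token) for token in line.split()]


split_row = clean_run_ons("7685.000 68.809 0.150 0.411 7.330-19508.961 10.424")
nan_row = clean_run_ons("7684.500 68.633 0.149 0.402 7.345116238.453 10.515")
```

The column count is preserved in both cases, which keeps the remaining values aligned with their curves.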
Handling invalid data indicators automatically¶
lasio detects these indicators to a degree that you can control with the null_policy keyword argument.
You can specify a policy of ‘none’, ‘strict’, ‘common’, ‘aggressive’, ‘all’, or ‘numbers-only’. Each policy applies a subset of the predefined substitutions. Alternatively, you can give your own list of substitutions. Here are the predefined policies and substitutions from lasio.defaults.
Policies that you can pick with e.g. null_policy='common':
NULL_POLICIES = {
    'none': [],
    'strict': ['NULL', ],
    'common': ['NULL', '(null)', '-',
               '9999.25', '999.25', 'NA', 'INF', 'IO', 'IND'],
    'aggressive': ['NULL', '(null)', '--',
                   '9999.25', '999.25', 'NA', 'INF', 'IO', 'IND',
                   '999', '999.99', '9999', '9999.99', '2147483647', '32767',
                   '-0.0', ],
    'all': ['NULL', '(null)', '-',
            '9999.25', '999.25', 'NA', 'INF', 'IO', 'IND',
            '999', '999.99', '9999', '9999.99', '2147483647', '32767', '-0.0',
            'numbers-only', ],
    'numbers-only': ['numbers-only', ]
}
Substitutions that you can specify with e.g. null_policy=['NULL', '999.25', 'INF']:
NULL_SUBS = {
    'NULL': [None, ],  # special case to be handled
    '999.25': [-999.25, 999.25],
    '9999.25': [-9999.25, 9999.25],
    '999.99': [-999.99, 999.99],
    '9999.99': [-9999.99, 9999.99],
    '999': [-999, 999],
    '9999': [-9999, 9999],
    '2147483647': [-2147483647, 2147483647],
    '32767': [-32767, 32767],
    'NA': [(re.compile(r'(#N/A)[ ]'), ' NaN '),
           (re.compile(r'[ ](#N/A)'), ' NaN '), ],
    'INF': [(re.compile(r'(-?1\.#INF)[ ]'), ' NaN '),
            (re.compile(r'[ ](-?1\.#INF)'), ' NaN '), ],
    'IO': [(re.compile(r'(-?1\.#IO)[ ]'), ' NaN '),
           (re.compile(r'[ ](-?1\.#IO)'), ' NaN '), ],
    'IND': [(re.compile(r'(-?1\.#IND)[ ]'), ' NaN '),
            (re.compile(r'[ ](-?1\.#IND)'), ' NaN '), ],
    '-0.0': [(re.compile(r'(-?0\.0+)[ ]'), ' NaN '),
             (re.compile(r'[ ](-?0\.0+)'), ' NaN '), ],
    'numbers-only': [(re.compile(r'([^ 0-9.\-+]+)[ ]'), ' NaN '),
                     (re.compile(r'[ ]([^ 0-9.\-+]+)'), ' NaN '), ],
}
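The two tables fit together as follows: a policy name selects a list of substitution names, and each substitution name expands into either numeric values to treat as null or (pattern, replacement) pairs to apply to the raw text. The sketch below illustrates that lookup with trimmed-down copies of the tables. expand_policy and parse_line are hypothetical helpers, not lasio functions, and the handling of None is simplified.

```python
import math
import re

# Trimmed-down copies of the tables above, for illustration only.
POLICIES = {"strict": ["NULL"], "common": ["NULL", "999.25", "NA"]}
SUBS = {
    "NULL": [None],
    "999.25": [-999.25, 999.25],
    "NA": [(re.compile(r"(#N/A)[ ]"), " NaN "),
           (re.compile(r"[ ](#N/A)"), " NaN ")],
}


def expand_policy(null_policy):
    """Hypothetical helper: split a policy into null values and regex pairs."""
    # A string names a predefined policy; a list names substitutions directly.
    names = POLICIES[null_policy] if isinstance(null_policy, str) else null_policy
    values, regexps = [], []
    for name in names:
        for sub in SUBS[name]:
            if isinstance(sub, tuple):
                regexps.append(sub)
            elif sub is not None:  # None stands for the header NULL value
                values.append(sub)
    return values, regexps


def parse_line(line, values, regexps):
    """Apply the text substitutions, parse floats, then blank null values."""
    for pattern, replacement in regexps:
        line = pattern.sub(replacement, line)
    numbers = [float(token) for token in line.split()]
    return [math.nan if number in values else number for number in numbers]


null_values, null_regexps = expand_policy("common")
row = parse_line("1670.0 #N/A 999.25 0.45", null_values, null_regexps)
```

Here the #N/A marker is replaced in the text and 999.25 is replaced after parsing, so both end up as NaN while the other columns are unchanged.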
You can also specify substitutions directly as (pattern, replacement) pairs. For example, take a file with this data section:
~A DEPTH DT RHOB NPHI SFLU SFLA ILM ILD
1670.000 9998 2550.000 0.450 123.450 123.450 110.200 105.600
1669.875 9999 2550.000 0.450 123.450 123.450 110.200 105.600
1669.750 10000 ERR 0.450 123.450 -999.25 110.200 105.600
Ordinarily, reading it would raise an exception:
In [13]: las = lasio.read('tests/examples/null_policy_ERR.las')
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
~\Code\lasio\lasio\reader.py in read_file_contents(file_obj, regexp_subs, value_null_subs, ignore_data)
271 try:
--> 272 data = read_data_section_iterative(file_obj, regexp_subs, value_null_subs)
273 except:
~\Code\lasio\lasio\reader.py in read_data_section_iterative(file_obj, regexp_subs, value_null_subs)
348
--> 349 array = np.fromiter(items(file_obj), np.float64, -1)
350 for value in value_null_subs:
ValueError: could not convert string to float: 'ERR'
During handling of the above exception, another exception occurred:
LASDataError Traceback (most recent call last)
<ipython-input-13-0cb27623119d> in <module>()
----> 1 las = lasio.read('tests/examples/null_policy_ERR.las')
~\Code\lasio\lasio\__init__.py in read(file_ref, **kwargs)
41
42 '''
---> 43 return LASFile(file_ref, **kwargs)
~\Code\lasio\lasio\las.py in __init__(self, file_ref, **read_kwargs)
76
77 if not (file_ref is None):
---> 78 self.read(file_ref, **read_kwargs)
79
80 def read(self, file_ref,
~\Code\lasio\lasio\las.py in read(self, file_ref, ignore_data, read_policy, null_policy, ignore_header_errors, **kwargs)
106
107 self.raw_sections = reader.read_file_contents(
--> 108 file_obj, regexp_subs, value_null_subs, ignore_data=ignore_data, )
109
110 if hasattr(file_obj, "close"):
~\Code\lasio\lasio\reader.py in read_file_contents(file_obj, regexp_subs, value_null_subs, ignore_data)
274 raise exceptions.LASDataError(
275 traceback.format_exc()[:-1] +
--> 276 ' in data section beginning line {}'.format(i + 1))
277 sections[line] = {
278 "section_type": "data",
LASDataError: Traceback (most recent call last):
File "C:\Users\kent\Code\lasio\lasio\reader.py", line 272, in read_file_contents
data = read_data_section_iterative(file_obj, regexp_subs, value_null_subs)
File "C:\Users\kent\Code\lasio\lasio\reader.py", line 349, in read_data_section_iterative
array = np.fromiter(items(file_obj), np.float64, -1)
ValueError: could not convert string to float: 'ERR' in data section beginning line 43
But if we specify the regular expression and its replacement to use with re.sub(), we can easily load it:
In [14]: las = lasio.read('tests/examples/null_policy_ERR.las', null_policy=[('ERR', ' NaN '), ])
In [16]: las.data
Out[16]:
array([[1670.0, 9998.0, 2550.0, 0.45, 123.45, 123.45, 110.2, 105.6],
[1669.875, 9999.0, 2550.0, 0.45, 123.45, 123.45, 110.2, 105.6],
[1669.75, 10000.0, nan, 0.45, 123.45, -999.25, 110.2, 105.6]])
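As the text above indicates, the (pattern, replacement) pair is used with re.sub() on the raw text before the numbers are parsed. A minimal sketch of that step on the last row of this file, independent of lasio:

```python
import re

row = "1669.750 10000 ERR 0.450 123.450 -999.25 110.200 105.600"
pattern, replacement = ("ERR", " NaN ")

# Replace the text marker, then parse every column as a float.
cleaned = re.sub(pattern, replacement, row)
values = [float(token) for token in cleaned.split()]
```

A bare pattern such as 'ERR' also matches inside longer tokens (for example ERROR), so anchoring it with a neighbouring space, as the predefined substitutions do, is the safer choice for real files.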
See tests/test_null_policy.py for some examples.