Read data from XLSX provided as XSTRING - odata

An Excel file (.xlsx) is uploaded on the frontend which is UI5 Fiori.
The file contents come to SAP ABAP backend via ODATA in XSTRING format.
I need to store that XSTRING in an internal table and then in a DDIC table. For example, if the Excel file has 5 columns, I want to store the data of those 5 columns in the corresponding columns of the DDIC table.
I have tried various Function Modules like:
SCMS_XSTRING_TO_BINARY
SCMS_BINARY_TO_STRING
and following Classes & methods:
cl_bcs_convert=>raw_to_string
cl_soap_xml_helper=>xstring_to_string
but none were able to convert the XSTRING to STRING.
Can you please suggest which function module or class/method can be used to solve the problem?

The most convenient option is abap2xlsx.
If you cannot or do not want to use that, you can alternatively parse the Excel file on your own. .xlsx files are basically .zip files with a different file ending. Use cl_abap_zip->load to open the xstring you receive and ->get to extract the individual files from the zip. Afterwards, use XML parsers like cl_ixml or transformations to parse the XML content of the files.
Note that Excel's XML is a complicated file format, with several files that work together to form the worksheets. Refer to Microsoft's File format reference for Word, Excel, and PowerPoint for details. It's non-trivial to interpret this, so you will usually be a lot happier with abap2xlsx.
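To illustrate the "an .xlsx is just a .zip" point outside of ABAP, here is a minimal Python sketch. Python's zipfile module stands in for cl_abap_zip, and the function names are made up for this example. The demo archive is only a stand-in with the two usual part names, not a valid workbook.

```python
import io
import zipfile


def list_xlsx_parts(xlsx_bytes: bytes) -> list[str]:
    """Return the names of the files stored inside an .xlsx archive."""
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as archive:
        return archive.namelist()


def read_xlsx_part(xlsx_bytes: bytes, part_name: str) -> bytes:
    """Extract one file (for example a worksheet XML) from the archive."""
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as archive:
        return archive.read(part_name)


def build_demo_archive() -> bytes:
    """Build a tiny stand-in archive (NOT a valid workbook) for demonstration."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("xl/worksheets/sheet1.xml", "<worksheet/>")
        archive.writestr("xl/sharedStrings.xml", "<sst/>")
    return buffer.getvalue()


if __name__ == "__main__":
    demo = build_demo_archive()
    for name in list_xlsx_parts(demo):
        print(name, len(read_xlsx_part(demo, name)), "bytes")
```

The bytes you get back for each part are plain XML, which is what you would then hand to an XML parser or a transformation.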

As Florian said, abap2xlsx is the most powerful and feature-rich way of doing this: it supports styles, charts, and complex tables. However, it may not always be available, for example because of system limitations or restrictions on installing custom packages.
Here is how to accomplish this with pure standard tools, without custom frameworks.
Since NetWeaver 7.02, SAP has supported the Microsoft Office Open XML formats natively and provides classes for handling them: CL_XLSX_DOCUMENT, CL_DOCX_DOCUMENT and CL_PPTX_DOCUMENT. abap2xlsx is built on these classes too. So let's start a bit of reinventing the wheel.
An XLSX file is an OpenXML archive of files, of which the most interesting are sheet1.xml and sharedStrings.xml. Let's build a sample based on MARC table fields.
Now you want to transfer this table to an internal table with the same structure. The steps would be:
Extract needed files from XLSX archive
Read worksheet structure from sheet1.xml
Read sheet values from sharedStrings.xml
Map them together and write the result to the internal table
Here is a sample class that handles the job. I used the cl_openxml_helper class to load the XLSX, but you can receive the XLSX as an XSTRING in whatever way you like.
CLASS xlsx_reader DEFINITION.
PUBLIC SECTION.
TYPES: BEGIN OF ty_marc,
matnr TYPE char20,
werks TYPE char20,
disls TYPE char20,
ekgrp TYPE char20,
dismm TYPE char20,
END OF ty_marc,
tt_marc TYPE STANDARD TABLE OF ty_marc WITH EMPTY KEY.
METHODS: read RETURNING VALUE(tab) TYPE tt_marc,
extract_xml IMPORTING index TYPE i
xstring TYPE xstring
RETURNING VALUE(rv_xml_data) TYPE xstring.
ENDCLASS.
CLASS xlsx_reader IMPLEMENTATION.
METHOD read.
TYPES: BEGIN OF ty_row,
value TYPE string,
index TYPE abap_bool,
END OF ty_row,
BEGIN OF ty_worksheet,
row_id TYPE i,
row TYPE TABLE OF ty_row WITH EMPTY KEY,
END OF ty_worksheet,
BEGIN OF ty_si,
t TYPE string,
END OF ty_si.
DATA: data TYPE TABLE OF ty_si,
sheet TYPE TABLE OF ty_worksheet.
TRY.
DATA(xstring_xlsx) = cl_openxml_helper=>load_local_file( 'C:\marc.xlsx' ).
CATCH cx_openxml_not_found.
ENDTRY.
"Read the sheet XML
DATA(xml_sheet) = extract_xml( EXPORTING xstring = xstring_xlsx index = 2 ).
"Read the data XML
DATA(xml_data) = extract_xml( EXPORTING xstring = xstring_xlsx index = 3 ).
TRY.
* transforming structure into ABAP
CALL TRANSFORMATION zsheet
SOURCE XML xml_sheet
RESULT root = sheet.
* transforming data into ABAP
CALL TRANSFORMATION zxlsx_data
SOURCE XML xml_data
RESULT root = data.
CATCH cx_xslt_exception.
CATCH cx_st_match_element.
CATCH cx_st_ref_access.
ENDTRY.
* mapping structure and data
LOOP AT sheet ASSIGNING FIELD-SYMBOL(<fs_row>).
APPEND INITIAL LINE TO tab ASSIGNING FIELD-SYMBOL(<line>).
LOOP AT <fs_row>-row ASSIGNING FIELD-SYMBOL(<fs_cell>).
ASSIGN COMPONENT sy-tabix OF STRUCTURE <line> TO FIELD-SYMBOL(<fs_field>).
CHECK sy-subrc = 0.
<fs_field> = COND #( WHEN <fs_cell>-index = abap_false THEN <fs_cell>-value ELSE VALUE #( data[ <fs_cell>-value + 1 ]-t OPTIONAL ) ).
ENDLOOP.
ENDLOOP.
ENDMETHOD.
METHOD extract_xml.
TRY.
DATA(lo_package) = cl_xlsx_document=>load_document( iv_data = xstring ).
DATA(lo_parts) = lo_package->get_parts( ).
CHECK lo_parts IS BOUND AND lo_package IS BOUND.
DATA(lv_uri) = lo_parts->get_part( 2 )->get_parts( )->get_part( index )->get_uri( )->get_uri( ).
DATA(lo_xml_part) = lo_package->get_part_by_uri( cl_openxml_parturi=>create_from_partname( lv_uri ) ).
rv_xml_data = lo_xml_part->get_data( ).
CATCH cx_openxml_format cx_openxml_not_found.
ENDTRY.
ENDMETHOD.
ENDCLASS.
zsheet transformation:
<?sap.transform simple?>
<tt:transform xmlns:tt="http://www.sap.com/transformation-templates" template="main">
<tt:root name="root"/>
<tt:template name="main">
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:x14ac=
"http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac" xmlns:xr="http://schemas.microsoft.com/office/spreadsheetml/2014/revision" xmlns:xr2="http://schemas.microsoft.com/office/spreadsheetml/2015/revision2" xmlns:xr3=
"http://schemas.microsoft.com/office/spreadsheetml/2016/revision3">
<tt:skip count="4"/>
<sheetData>
<tt:loop name="row" ref="root">
<row>
<tt:attribute name="r" value-ref="row_id"/>
<tt:loop name="cells" ref="$row.ROW">
<c>
<tt:cond><tt:attribute name="t" value-ref="index"/><tt:assign to-ref="index" val="C('X')"/></tt:cond>
<v><tt:value ref="value"/></v>
</c>
</tt:loop>
</row>
</tt:loop>
</sheetData>
<tt:skip count="2"/>
</worksheet>
</tt:template>
</tt:transform>
zxlsx_data transformation
<?sap.transform simple?>
<tt:transform xmlns:tt="http://www.sap.com/transformation-templates" template="main">
<tt:root name="ROOT"/>
<tt:template name="main">
<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
<tt:loop name="line" ref=".ROOT">
<si>
<t>
<tt:value ref="t"/>
</t>
</si>
</tt:loop>
</sst>
</tt:template>
</tt:transform>
Here is how to call it:
START-OF-SELECTION.
DATA(reader) = NEW xlsx_reader( ).
DATA(marc) = reader->read( ).
The code is pretty self-explanatory, but here are a couple of notes:
The file sheet1.xml contains a special attribute t on its cells, which denotes whether the value should be treated as a literal or as a reference into sharedStrings.xml.
I used two simple transformations, but XSLT can be used as well, possibly allowing you to reduce all the XML handling to a single transformation.
I deliberately used generic char20 types to be able to handle headers. If you want to preserve native types, you cannot read the table header (skip the first line in the sheet LOOP), because the header text would cause a type violation and a dump. If you receive the table without headers, it is fine to declare the structure with native types.
If you don't want to use transformations, then sXML is your friend. You can parse XML with classes as well, but ST transformations are considerably faster.
With some additional effort you can make this snippet dynamic and parse XLSX with any structure.
You can read more about this approach in this doc.
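For readers who want to see the mapping step in isolation, here is a minimal Python sketch of the same idea: cells whose t attribute is "s" hold an index into sharedStrings.xml, all other cells are taken literally. The sample XML strings and function names are made up for this illustration. Real files also contain rich-text shared strings and other cell types, which this sketch ignores. The index stored in the file is 0-based, which is why the ABAP code above adds 1 when reading the internal table.

```python
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace, as used in the transformations above
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

# Hand-written sample data, not taken from a real workbook
SHARED = (
    '<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    "<si><t>MATNR</t></si><si><t>WERKS</t></si>"
    "</sst>"
)
SHEET = (
    '<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    "<sheetData>"
    '<row r="1"><c r="A1" t="s"><v>0</v></c><c r="B1" t="s"><v>1</v></c></row>'
    '<row r="2"><c r="A2"><v>4711</v></c><c r="B2"><v>1000</v></c></row>'
    "</sheetData></worksheet>"
)


def read_shared_strings(shared_xml: str) -> list[str]:
    """Collect the plain text of every <si> entry, in file order."""
    root = ET.fromstring(shared_xml)
    return [si.findtext("m:t", default="", namespaces=NS)
            for si in root.findall("m:si", NS)]


def read_rows(sheet_xml: str, shared: list[str]) -> list[list[str]]:
    """Return the cell values row by row, resolving shared-string references."""
    root = ET.fromstring(sheet_xml)
    rows = []
    for row in root.findall("m:sheetData/m:row", NS):
        values = []
        for cell in row.findall("m:c", NS):
            raw = cell.findtext("m:v", default="", namespaces=NS)
            if cell.get("t") == "s":
                values.append(shared[int(raw)])  # 0-based index into the list
            else:
                values.append(raw)               # literal value
        rows.append(values)
    return rows


if __name__ == "__main__":
    for line in read_rows(SHEET, read_shared_strings(SHARED)):
        print(line)
```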

Related

How to exclude multiple values in OData call?

I am creating a SAPUI5 application. This application is connected to a backend SAP system via OData. In the SAPUI5 application I use a smart chart control. Out of the box the smart chart lets the user create filters for the underlying data. This works fine - except if you try to use multiple 'not equals' for one property. Is there a way to accomplish this?
I found out that all properties within an 'and_expression' (including nested or_expressions) must have unique name.
The reason why two parameters with the same property don't get parsed into the select options:
/IWCOR/CL_ODATA_EXPR_UTILS=>GET_FILTER_SELECT_OPTIONS takes the expression you pass and parses it into a table of select options.
The select option table returned is of type /IWCOR/IF_ODATA_TYPES=>EDM_SELECT_OPTION_T which is a HASHED TABLE .. WITH UNIQUE KEY property.
From: https://archive.sap.com/discussions/thread/3170195
The problem is that you cannot combine NE terms with OR, because both values after the NE should be excluded from the result set.
So in the end it_filter_select_options is empty and only iv_filter_string is filled.
Is there a manual way of facing this problem (evaluation of the iv_filter_string) to handle multiple NE terms?
This would be an example request:
XYZ/SmartChartSet?$filter=(Category%20ne%20%27Smartphone%27%20and%20Category%20ne%20%27Notebook%27)%20and%20Purchaser%20eq%20%27CompanyABC%27%20and%20BuyDate%20eq%20datetime%272018-10-12T02%3a00%3a00%27&$inlinecount=allpages
Normally I want this to exclude items with the category 'Notebook' and 'Smartphone' from my result set that I retrieve from the backend.
If there is a bug inside /iwcor/cl_odata_expr_utils=>get_filter_select_options that makes it unable to handle multiple NE filters on the same component, and you cannot wait for an OSS note, I would suggest wrapping it in a new static method that implements the following logic (if you get stuck with the ABAP implementation, I will try to at least partially implement it when I get time):
Get all instances of <COMPONENT> ne '<VALUE>' inside a () (using REGEX).
Replace each <COMPONENT> with <COMPONENT>_<i> so there will be ( <COMPONENT>_1 ne '<VALUE_1>' and <COMPONENT>_2 ne '<VALUE_2>' and... <COMPONENT>_<n> ne '<VALUE_n>' ).
Call /iwcor/cl_odata_expr_utils=>get_filter_select_options with the modified query.
Modify the rt_select_options result by changing COMPONENT_<i> to <COMPONENT> again.
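The renaming steps (the first two and the last one) can be sketched in a few lines. This is Python rather than ABAP, only to show the regex idea. The function name and the alias map are invented for this example, it works on the already URL-decoded filter string, it renames every NE term rather than only those inside one pair of parentheses, and it does not handle quotes escaped inside a value.

```python
import re

# Matches <property> ne '<value>' (value without embedded quotes)
NE_TERM = re.compile(r"(\w+)\s+ne\s+'([^']*)'")


def number_ne_terms(filter_string: str) -> tuple[str, dict[str, str]]:
    """Suffix the property of each NE term with a running index.

    Returns the rewritten filter and a map from each alias back to the
    original property name, so the aliases can be undone afterwards.
    """
    aliases: dict[str, str] = {}
    counters: dict[str, int] = {}

    def rename(match: re.Match) -> str:
        prop, value = match.group(1), match.group(2)
        counters[prop] = counters.get(prop, 0) + 1
        alias = f"{prop}_{counters[prop]}"
        aliases[alias] = prop
        return f"{alias} ne '{value}'"

    return NE_TERM.sub(rename, filter_string), aliases


if __name__ == "__main__":
    query = ("(Category ne 'Smartphone' and Category ne 'Notebook') "
             "and Purchaser eq 'CompanyABC'")
    rewritten, alias_map = number_ne_terms(query)
    print(rewritten)
    print(alias_map)
```

After the select options have been built from the rewritten filter, the alias map tells you which entries to merge back under the original property name.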
I can't find the source but I recall that multiple "ne" isn't supported. Isn't that the same thing that happens when you do multiple negatives in SE16, some warning is displayed?
I found this extract for Business ByDesign:
Excluding two values using the OR operator (for example: $filter=CACCDOCTYPE ne ‘1000’ or CACCDOCTYPE ne ‘4000’) is not possible.
The workaround I see is to select the Categories you actively want, not the ones you don't in the UI5 app.
I can also confirm that my code snippet I've used a long time for filtering also has the same problem...
* <SIGNATURE>---------------------------------------------------------------------------------------+
* | Instance Public Method ZCL_MGW_ABS_DATA->FILTERING
* +-------------------------------------------------------------------------------------------------+
* | [--->] IO_TECH_REQUEST_CONTEXT TYPE REF TO /IWBEP/IF_MGW_REQ_ENTITYSET
* | [<-->] CR_ENTITYSET TYPE REF TO DATA
* | [!CX!] /IWBEP/CX_MGW_BUSI_EXCEPTION
* | [!CX!] /IWBEP/CX_MGW_TECH_EXCEPTION
* +--------------------------------------------------------------------------------------</SIGNATURE>
METHOD FILTERING.
FIELD-SYMBOLS <lt_entityset> TYPE STANDARD TABLE.
"Check the reference before dereferencing it
CHECK cr_entityset IS BOUND.
ASSIGN cr_entityset->* TO <lt_entityset>.
CHECK <lt_entityset> IS ASSIGNED.
DATA(lo_filter) = io_tech_request_context->get_filter( ).
/iwbep/cl_mgw_data_util=>filtering(
exporting it_select_options = lo_filter->get_filter_select_options( )
changing ct_data = <lt_entityset> ).
ENDMETHOD.

Saving a composite datawindow in PowerBuilder

Is there a way to save the results in a composite datawindow as text or as an Excel spreadsheet? PowerBuilder states that the save format for composites is PSReport, which doesn't work for what I'm trying to do. Is there a workaround for this issue?
A composite datawindow may contain any number of nested datawindows. So if memory serves, you cannot save the entire composite datawindow as data in a meaningful way (SaveAs will just give you a one-line bit of meaningless data), but you CAN save each of the nested datawindows inside the composite.
Here is some PFC code I wrote (for inside a menu item) which makes a copy of a nested report and then executes a SaveAs (dialog in this case):
//must create a 'dummy' datawindow,
//and put the data on the nested report into it
u_dw ldw_Temp
Window lw_Parent
String ls_Syntax, ls_Error
If Not IsValid(i_dwo) Then
MessageBox('Unexpected Error', &
'The pointer to the datawindow object was invalid. Contact Systems.')
Return
End If
If i_dwo.Type = 'report' Then
//continue
Else
MessageBox('Unexpected Error', &
'The pointer did not refer to a report. Contact Systems.')
Return
End If
If idw_Parent.of_GetParentWindow(lw_Parent) = 1 Then
ls_Syntax = i_dwo.object.datawindow.syntax
If lw_Parent.OpenUserObject(ldw_Temp) = 1 Then
If ldw_Temp.Create(ls_Syntax,ls_Error) = 1 Then
ldw_Temp.Object.Data.Primary = i_dwo.Object.Data.Primary
ldw_Temp.Event pfc_SaveAs()
Else
If IsNull(ls_Error) Or ls_Error = '' Then ls_Error = '<unknown error>'
MessageBox('Error','Error creating datawindow object: ' + ls_Error)
End If
lw_Parent.CloseUserObject(ldw_Temp)
Else
MessageBox('Error','Error creating datawindow control on ' + lw_Parent.ClassName())
End If
Else
MessageBox('Error','Unable to obtain pointer to the parent window.')
End If
Basically this code gets the syntax of the datawindow object underlying the nested report (ls_Syntax = i_dwo.object.datawindow.syntax above), then creates a datawindow control on the parent form and loads that syntax into it, and then copies the data from the one into the other. Finally it calls SaveAs on the copy, and the user is presented with a dialog asking where they want to save the data from the nested report and in what format.
You could automate this so that it saves each nested report as a separate file, and then if you like you could automate the formation of those separate files into a single file (sheets inside a workbook, appended text files, etc).
There are other ways to do what you are asking, depending on what exactly you are trying to accomplish.
The i_dwo variable was loaded in the right-mouse-up event of the datawindow control (the dwo event variable there).
That may get you running, please ask questions if you have any.

Unable to append a sheet using OpenXml with F# (FSharp)

The CreateSpreadsheetWorkbook example method from the OpenXml documentation does not translate directly to F#. The problem seems to be the Append method of the Sheets object. The code executes without error, but the resulting xlsx file is missing the inner XML which should have been appended, and the file is unreadable by Excel. I suspect the problem stems from the conversion of functional F# structures into a System.Collections type, but I do not have direct evidence for this.
I have run similar code in C# and VB.NET (i.e. the documentation example) and it executes perfectly and creates a readable, complete xlsx file.
I know that I could deal with the XML directly, but I would like to understand the nature of the mismatch between F# and OpenXml. Any suggestions?
The code is almost directly from the example:
namespace OpenXmlLib
open System
open DocumentFormat
open DocumentFormat.OpenXml
open DocumentFormat.OpenXml.Packaging
open DocumentFormat.OpenXml.Spreadsheet
module OpenXmlXL =
// this function overwrites an existing file without warning!
let CreateSpreadsheetWorkbook (filepath: string) =
// Create a spreadsheet document by supplying the filepath.
// By default, AutoSave = true, Editable = true, and Type = xlsx.
let spreadsheetDocument = SpreadsheetDocument.Create(filepath, SpreadsheetDocumentType.Workbook)
// Add a WorkbookPart to the document.
let workbookpart = spreadsheetDocument.AddWorkbookPart()
workbookpart.Workbook <- new Workbook()
// Add a WorksheetPart to the WorkbookPart.
let worksheetPart = workbookpart.AddNewPart<WorksheetPart>()
worksheetPart.Worksheet <- new Worksheet(new SheetData())
// Add Sheets to the Workbook.
let sheets = spreadsheetDocument.WorkbookPart.Workbook.AppendChild<Sheets>(new Sheets())
// Append a new worksheet and associate it with the workbook.
let sheet = new Sheet()
sheet.Id <- StringValue(spreadsheetDocument.WorkbookPart.GetIdOfPart(worksheetPart))
//Console.WriteLine(sheet.Id.Value)
sheet.SheetId <- UInt32Value(1u)
// Console.WriteLine(sheet.SheetId.Value)
sheet.Name <- StringValue("TestSheet")
//Console.WriteLine(sheet.Name.Value)
sheets.Append (sheet)
// Console.WriteLine("Sheets: {0}", sheets.InnerXml.ToString())
workbookpart.Workbook.Save()
spreadsheetDocument.Close()
The sheet is created, but empty:
sheet.xml:
<?xml version="1.0" encoding="utf-8" ?>
<x:worksheet xmlns:x="http://schemas.openxmlformats.org/spreadsheetml/2006/main" />
workbook.xml:
<?xml version="1.0" encoding="utf-8" ?>
<x:workbook xmlns:x="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
<x:sheets>
<x:sheet name="TestSheet" sheetId="1" r:id="R263eb6f245a2497e" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" />
</x:sheets>
</x:workbook>
The problem is very subtle, and is in your calls to the Worksheet constructor and the Sheets.Append method. Both of these methods are overloaded, and can take either a seq<OpenXmlElement> or any number of individual OpenXmlElements (via a [<System.ParamArray>]/params array). The twist is that the OpenXmlElement type itself implements the seq<OpenXmlElement> interface.
In C#, when you call new Worksheet(new SheetData()), the compiler's overload resolution picks the second of the overloads, implicitly creating a one-element array containing the SheetData value. However, in F#, since the SheetData class implements IEnumerable<OpenXmlElement>, the first overload is chosen, which creates a new Worksheet by enumerating the contents of the SheetData, which is not what you want.
To fix this, you need to set up your calls so that they use the other overload (first example below) or explicitly create a singleton sequence (second example below):
worksheetPart.Worksheet <- new Worksheet(new SheetData() :> OpenXmlElement)
...
sheets.Append([sheet :> OpenXmlElement])

Best way of storing an "array of records" at design-time

I have a set of data that I need to store at design-time to construct the contents of a group of components at run-time.
Something like this:
type
TVulnerabilityData = record
Vulnerability: TVulnerability;
Name: string;
Description: string;
ErrorMessage: string;
end;
What's the best way of storing this data at design-time for later retrieval at run-time? I'll have about 20 records for which I know all the contents of each "record" but I'm stuck on what's the best way of storing the data.
The only semi-elegant idea I've come up with is "construct" each record on the unit's initialization like this:
var
VulnerabilityData: array[Low(TVulnerability)..High(TVulnerability)] of TVulnerabilityData;
....
initialization
VulnerabilityData[0].Vulnerability := vVulnerability1;
VulnerabilityData[0].Name := 'Name of Vulnerability1';
VulnerabilityData[0].Description := 'Description of Vulnerability1';
VulnerabilityData[0].ErrorMessage := 'Error Message of Vulnerability1';
VulnerabilityData[1]......
.....
VulnerabilityData[20]......
Is there a better and/or more elegant solution than this?
Thanks for reading and for any insights you might provide.
You can also declare your array as consts and initialize it...
const
VulnerabilityData: array[Low(TVulnerability)..High(TVulnerability)] of TVulnerabilityData =
(
(Vulnerability : vVulnerability1; Name : Name1; Description : Description1; ErrorMessage : ErrorMessage1),
(Vulnerability : vVulnerability2; Name : Name2; Description : Description2; ErrorMessage : ErrorMessage2),
[...]
(Vulnerability : vVulnerabilityX; Name : NameX; Description : DescriptionX; ErrorMessage : ErrorMessageX)
);
I don't have an IDE on this computer to double check the syntax... might be a comma or two missing. But this is how you should do it I think.
Not an answer, but maybe a clue: design-time controls can have images and other binary data associated with them, so why not write your data to a resource file and read it from there? Iterating over it, of course, to make it simpler, more extensible and more elegant.
The typical way would be a file, either properties style (a=b\n on each line), cdf, xml, yaml (preferred if you have a parser for it), or a database.
If you must specify it in code as in your example, you should start by putting it in something you can parse into a simple format then iterate over it. For instance, in Java I'd instantiate an array:
String[] vals=new String[]{
"Name of Vulnerability1", "Description of Vulnerability1", "Error Message of Vulnerability1",
"Name of Vulnerability2", ...
};
This puts all your data into one place and the loop that reads it can easily be changed to read it from a file.
I use this pattern all the time to create menus and for other string-intensive initialization.
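As a rough sketch of "the loop that reads it", here is the same flat-array idea in Python. The record type, its field names, and the second record's placeholder strings are made up for illustration. The only assumption carried over from above is that every record occupies three consecutive strings.

```python
from dataclasses import dataclass

FIELDS_PER_RECORD = 3  # name, description, error message


@dataclass
class VulnerabilityInfo:
    name: str
    description: str
    error_message: str


# Placeholder data in the flat layout described above
FLAT_VALUES = [
    "Name of Vulnerability1", "Description of Vulnerability1", "Error Message of Vulnerability1",
    "Name of Vulnerability2", "Description of Vulnerability2", "Error Message of Vulnerability2",
]


def load_records(values: list[str]) -> list[VulnerabilityInfo]:
    """Group the flat list into records of FIELDS_PER_RECORD strings each."""
    if len(values) % FIELDS_PER_RECORD != 0:
        raise ValueError("flat data does not divide evenly into records")
    return [
        VulnerabilityInfo(*values[i:i + FIELDS_PER_RECORD])
        for i in range(0, len(values), FIELDS_PER_RECORD)
    ]


if __name__ == "__main__":
    for record in load_records(FLAT_VALUES):
        print(record)
```

Swapping FLAT_VALUES for lines read from a file changes nothing in load_records, which is the point of keeping the data in one simple format.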
Don't forget that you can throw some logic in there too! For instance, with menus I will sometimes create them using data like this:
"^File", "Open", "Close", "^Edit", "Copy", "Paste"
As I'm reading this in I scan for the ^ which tells the code to make this entry a top level item. I also use "+Item" to create a sub-group and "-Item" to go back up to the previous group.
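A minimal Python sketch of the "^" part of that protocol could look like this. The function name and the dictionary result are choices made for this example, and the "+Item" / "-Item" sub-group markers are left out to keep it short.

```python
def build_menus(entries: list[str]) -> dict[str, list[str]]:
    """Turn a flat list into {top-level menu: [items]} using the ^ marker."""
    menus: dict[str, list[str]] = {}
    current = None
    for entry in entries:
        if entry.startswith("^"):
            # A ^ prefix starts a new top-level menu
            current = entry[1:]
            menus[current] = []
        elif current is None:
            raise ValueError(f"item {entry!r} appears before any top-level menu")
        else:
            # Everything else belongs to the most recent top-level menu
            menus[current].append(entry)
    return menus


if __name__ == "__main__":
    print(build_menus(["^File", "Open", "Close", "^Edit", "Copy", "Paste"]))
```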
Since you are completely specifying the format you can add power later. For instance, if you coded menus using the above system, you might decide at first that you could use the first letter of each item as an accelerator key. Later you find out that File/Close conflicts with another "C" item, you can just change the protocol to allow "Close*e" to specify that E should be the accelerator. You could even include ctrl-x with a different character. (If you do shorthand data entry tricks like this, document it with comments!)
Don't be afraid to write little tools like this, in the long run they will help you immensely, and I can turn out a parser like this and copy/paste the values into my code faster than you can mold a text file to fit your example.

DBF Large Char Field

I have a database file that I believe was created with Clipper but can't say for sure (I have .ntx files for indexes, which I understand is what Clipper uses). I am trying to create a C# application that will read this database using the System.Data.OleDB namespace.
For the most part I can successfully read the contents of the tables, but there is one field that I cannot read. This field, called CTRLNUMS, is defined as CHAR(750). I have read various articles found through Google searches that suggest fields larger than 255 chars have to be read through a different process than the normal assignment to a string variable. So far I have not been successful with any approach that I have found.
The following is a sample code snippet I am using to read the table and includes two options I used to read the CTRLNUMS field. Both options resulted in 238 characters being returned even though there are 750 characters stored in the field.
Here is my connection string:
Provider=Microsoft.Jet.OLEDB.4.0;Data Source=c:\datadir;Extended Properties=DBASE IV;
Can anyone tell me the secret to reading larger fields from a DBF file?
using (OleDbConnection conn = new OleDbConnection(connectionString))
{
conn.Open();
using (OleDbCommand cmd = new OleDbCommand())
{
cmd.Connection = conn;
cmd.CommandType = CommandType.Text;
cmd.CommandText = string.Format("SELECT ITEM,CTRLNUMS FROM STUFF WHERE ITEM = '{0}'", stuffId);
using (OleDbDataReader dr = cmd.ExecuteReader())
{
if (dr.Read())
{
stuff.StuffId = dr["ITEM"].ToString();
// OPTION 1
string ctrlNums = dr["CTRLNUMS"].ToString();
// OPTION 2
char[] buffer = new char[750];
int index = 0;
int readSize = 5;
while (index < 750)
{
long charsRead = dr.GetChars(dr.GetOrdinal("CTRLNUMS"), index, buffer, index, readSize);
index += (int)charsRead;
if (charsRead < readSize)
{
break;
}
}
}
}
}
}
You can find a description of the DBF structure here: http://www.dbf2002.com/dbf-file-format.html
What I think Clipper used to do was modify the Field structure so that, in Character fields, the Decimal Places held the high-order byte of the size, so Character field sizes were really 256*Decimals+Size.
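If that is right, it would also explain the number in the question: 750 = 256*2 + 238, so a reader that only looks at the one-byte length sees 238. Here is a small Python sketch of decoding a field descriptor with that rule. It assumes the classic 32-byte dBASE III descriptor layout (11-byte name, 1-byte type, 4 bytes skipped, 1-byte length, 1-byte decimal count, 14 reserved bytes), and the function name is made up.

```python
import struct

# 32-byte field descriptor: name, type, (4 bytes skipped), length, decimals, (14 reserved)
FIELD_DESCRIPTOR = struct.Struct("<11sc4xBB14x")


def field_length(descriptor: bytes) -> int:
    """Return the field length, applying the high-byte rule to Character fields."""
    _name, field_type, length, decimals = FIELD_DESCRIPTOR.unpack(descriptor)
    if field_type == b"C":
        # Character fields: decimal count holds the high-order byte of the size
        return 256 * decimals + length
    return length


if __name__ == "__main__":
    # Hand-built descriptor for a CHAR(750) field: length byte 238, decimals byte 2
    ctrlnums = FIELD_DESCRIPTOR.pack(b"CTRLNUMS", b"C", 238, 2)
    print(field_length(ctrlnums))
```

Numeric fields keep their normal meaning of the decimal count, so the rule is applied to type C only.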
I may have a C# class that reads dbfs (natively, not ADO/DAO), it could be modified to handle this case. Let me know if you're interested.
Are you still looking for an answer? Is this a one-off job or something that needs doing regularly?
I have a Python module that is primarily intended to extract data from all kinds of DBF files ... it doesn't yet handle the length_high_byte = decimal_places hack, but it's a trivial change. I'd be quite happy to (a) share this with you and/or (b) get a copy of such a DBF file for testing.
Added later: Extended-length feature added, and tested against files I've created myself. Offer to share code with anyone who would like to test it still stands. Still interested in getting some "real" files myself for testing.
3 suggestions that might be worth a shot...
1 - use Access to create a linked table to the DBF file, then use .Net to hit the table in the access database instead of going direct to the DBF.
2 - try the FoxPro OLEDB provider
3 - parse the DBF file by hand. Example is here.
My guess is that #1 should work the easiest, and #3 will give you the opportunity to fine tune your cussing skills. :)

Resources