Incorrect reading of XLS columns

I have a Delphi 10.1 program which uses a very recent version of Flexcel to read Excel spreadsheets. Although I also have Flexcel 3 installed and XLSX adapter available, I have been moving to the new way of working with Flexcel.

In this case, everything works fine when opening and reading a XLSX file for any of its contained tabsheets and for any order I specify to read the coloumns. The tabsheet values displayed on a TDrawGrid are also correct for the selected tabsheet. But, when I open a XLS file things don't work properly. If I stay on the first tabsheet, then things work okay but if I change to another tabsheet, the tDrawGrid shows the correct tabsheet values from the spreadsheet that has been opened but the values that get read in are actually always taken from the first tabsheet, not the alternative that I have chosen.

Is this a bug in the software or am I doing something wrong when selecting the tabsheet to be read? I am assuming the former because it works properly for XLSX but not for XLS.

Thanks
   Bruce.

Hi,
Is this with TFlexCelGrid?  Do you have a simple example code to show what you are seeing?

Hi Adrian. Thanks for the very rapid response.

No, I currently use a TDrawGrid, copied from one of your demos from a few years ago.

The code for the entire form is as follows:

unit PDF_ShtIm3;

interface

uses
  Winapi.Windows, Winapi.Messages, System.SysUtils, System.Variants, System.Classes, Vcl.Graphics,
  Vcl.Controls, Vcl.Forms, Vcl.Dialogs, Vcl.StdCtrls, Vcl.Buttons, Vcl.ExtCtrls,
  Vcl.Grids, Vcl.ComCtrls, ToolWin,
  VCL.FlexCel.Core, FlexCel.XlsAdapter,
  UCellReader, USparseArray;

type
  TfmSheetImport = class(TForm)
    pControl: TPanel;
    bbOpenSheet: TBitBtn;
    bbCancel: TBitBtn;
    pDefinitions: TPanel;
    gbDefineFields: TGroupBox;
    Label9: TLabel;
    Panel1: TPanel;
    Label5: TLabel;
    Label1: TLabel;
    Label6: TLabel;
    Label7: TLabel;
    Label8: TLabel;
    eDataColStr: TEdit;
    eSigmaColStr: TEdit;
    eUnitAgeColStr: TEdit;
    rgUncertainties: TRadioGroup;
    eDefaultMinimum: TEdit;
    Panel2: TPanel;
    bbImport: TBitBtn;
    gbDefineRows: TGroupBox;
    Label2: TLabel;
    Label3: TLabel;
    Label4: TLabel;
    meFromRow: TEdit;
    meToRow: TEdit;
    cbOmitNegatives: TCheckBox;
    sbSheet: TStatusBar;
    Splitter1: TSplitter;
    pData: TPanel;
    OpenDialogSprdSheet: TOpenDialog;
    SheetData: TDrawGrid;
    lShowOnly50Rows: TLabel;
    Label10: TLabel;
    eExtraVarColStr: TEdit;
    Label11: TLabel;
    Label12: TLabel;
    eExtraCutoff: TEdit;
    Label13: TLabel;
    Label14: TLabel;
    Label15: TLabel;
    procedure bbOpenSheetClick(Sender: TObject);
    procedure bbCancelClick(Sender: TObject);
    procedure SheetDataDrawCell(Sender: TObject; ACol, ARow: Integer;
      Rect: TRect; State: TGridDrawState);
    procedure bbImportClick(Sender: TObject);
    procedure FormShow(Sender: TObject);
    procedure rgUncertaintiesClick(Sender: TObject);
  private
    { Private declarations }
    CellData: TSparseCellArray; //For this demo we will store the data here, in your application you should use the data as you wish, send it to a db, etc.
    procedure OpenFile(const FileName: string);
    function GetCellValue(const aCol, aRow: integer): string;
    function ConvertCol2Int(AnyString : string) : integer;
  public
    { Public declarations }
    destructor Destroy; override;
  end;

var
  fmSheetImport: TfmSheetImport;

implementation

{$R *.dfm}

uses
  AllSorts, PDF_varb, PDF_dm;

destructor TfmSheetImport.Destroy;
begin
  CellData.Free;
  inherited;
end;

procedure TfmSheetImport.FormShow(Sender: TObject);
begin
  lShowOnly50Rows.Visible := ShowOnly50Rows;
  Splitter1.Visible := true;
  pDefinitions.Visible := false;
  eDataColStr.Text := UpperCase(DataColStr);
  eSigmaColStr.Text := UpperCase(SigmaColStr);
  eUnitAgeColStr.Text := UpperCase(UnitAgeColStr);
  eExtraVarColStr.Text := UpperCase(ExtraVarColStr);
  meFromRow.Text := '2';
  meToRow.Text := '3';
  bbImport.Enabled := true;
  if (dmPDF.SigmaFactor = 1.0) then rgUncertainties.ItemIndex := 0;
  if (dmPDF.SigmaFactor = 1.0/2.0) then rgUncertainties.ItemIndex := 1;
  if (dmPDF.SigmaFactor = 1.0/1.96) then rgUncertainties.ItemIndex := 2;
  //bbOpenSheetClick(Sender);
  OpenDialogSprdSheet.FileName := '';
end;

procedure TfmSheetImport.bbCancelClick(Sender: TObject);
begin
  dmPDF.DataImported := true;
  Close;
end;

function TfmSheetImport.ConvertCol2Int(AnyString : string) : integer;
var
  itmp    : integer;
  tmpStr  : string;
  tmpChar : char;
begin
    AnyString := UpperCase(AnyString);
    tmpStr := AnyString;
    ClearNull(tmpStr);
    Result := 0;
    if (length(tmpStr) = 2) then
    begin
      tmpChar := tmpStr[1];
      itmp := (ord(tmpChar)-64)*26;
      tmpChar := tmpStr[2];
      Result := itmp+(ord(tmpChar)-64);
    end else
    begin
      tmpChar := tmpStr[1];
      Result := (ord(tmpChar)-64);
    end;
end;


procedure TfmSheetImport.bbImportClick(Sender: TObject);
var
  j, k     : integer;
  iCode : integer;
  i : integer;
  FromRow, ToRow : integer;
  tmpStr : string;
  tmpDataStr, tmpSigmaStr, tmpUnitAgeStr,
  tmpExtraVarStr : string;
  tmpData, tmpSigma, tmpUnitAge,
  tmpExtraVar : double;
  WasSuccessful : boolean;
  tmpDataValue, tmpSigmaValue, tmpUnitAgeValue,
  tmpExtraVarValue : double;
  Xls: TExcelFile;
  Formatted : boolean;
  v : TCellValue;
begin
  case rgUncertainties.ItemIndex of
    0 : dmPDF.SigmaFactor := 1.0;
    1 : dmPDF.SigmaFactor := 1.0/2.0;
    2 : dmPDF.SigmaFactor := 1.0/1.96;
  end;
  dmPDF.DataImported := false;
  iCode := 1;
  repeat
    tmpStr := eDefaultMinimum.Text;
    Val(tmpStr, DefaultMinimum, iCode);
    if (iCode = 0) then
    begin
      tmpStr := meFromRow.Text;
      Val(tmpStr, FromRow, iCode);
    end else
    begin
      ShowMessage('Incorrect value entered for Default Minimum');
      Exit;
    end;
    //tmpStr := meFromRow.Text;
    //Val(tmpStr, FromRow, iCode);
    if (iCode = 0) then
    begin
      tmpStr := meToRow.Text;
      Val(tmpStr, ToRow, iCode);
    end else
    begin
      ShowMessage('Incorrect value entered for From row');
      Exit;
    end;
    if (iCode = 0) then
    begin
      if (ToRow >= FromRow) then iCode := 0
                            else iCode := -1;
    end else
    begin
      ShowMessage('Incorrect value entered for To row');
      Exit;
    end;
    if (iCode <> 0)
      then begin
        ShowMessage('Incorrect values entered for import specifications');
        Exit;
      end;
  until (iCode = 0);
  tmpStr := UpperCase(eDataColStr.Text);
  DataCol := ConvertCol2Int(tmpStr);
  tmpStr := UpperCase(eSigmaColStr.Text);
  SigmaCol := ConvertCol2Int(tmpStr);
  tmpStr := UpperCase(eUnitAgeColStr.Text);
  if (Trim(tmpStr) <> '0') then
  begin
    UnitAgeCol := ConvertCol2Int(tmpStr);
  end else
  begin
    UnitAgeCol := 37000;
  end;
  tmpStr := UpperCase(eExtraVarColStr.Text);
  if (Trim(tmpStr) <> '0') then
  begin
    ExtraVarCol := ConvertCol2Int(tmpStr);
  end else
  begin
    ExtraVarCol := 37000;
  end;

  //ShowMessage(IntToStr(DataCol)+'   '+IntToStr(SigmaCol)+'   '+IntToStr(UnitAgeCol));
  //ShowMessage(IntToStr(FromRow)+'   '+IntToStr(ToRow));
   //Open the Excel file.
  Xls := TXlsFile.Create(false);
  try

    //By default, FlexCel returns the formula text for the formulas, besides its calculated value.
    //If you are not interested in formula texts, you can gain a little performance by ignoring it.
    //This also works in non virtual mode.
    xls.IgnoreFormulaText := true; //bme - hard code this for this situation

    xls.VirtualMode := false;
    try
      xls.Open(OpenDialogSprdSheet.FileName);
    finally
    end;

  with dmPDF do
  begin
    // import selected data
    sbSheet.SimpleText := 'Importing data from selected columns';
    sbSheet.Refresh;

    // do for all rows in data spreadsheet
    // repeat through variables
    // read data value

    dmPDF.cdsRawData.DisableControls;
    j := 1;
    for i := FromRow to ToRow do
    begin
      //if Formatted then
      //begin
      //  tmpDataStr := trim(Xls.GetStringFromCell(i,DataCol));
      //  tmpSigmaStr := trim(Xls.GetStringFromCell(i,SigmaCol));
      //  tmpUnitAgeStr := trim(Xls.GetStringFromCell(i,UnitAgeCol));
      //end
      //else
      //begin
        v := Xls.GetCellValue(i,DataCol);
        tmpDataStr := v.ToString;
        v := Xls.GetCellValue(i,SigmaCol);
        tmpSigmaStr := v.ToString;
        if (UnitAgeCol < 37000) then
        begin
          v := Xls.GetCellValue(i,UnitAgeCol);
          tmpUnitAgeStr := v.ToString;
        end else
        begin
          tmpUnitAgeStr := '-999.0';
        end;
        if (ExtraVarCol < 37000) then
        begin
          v := Xls.GetCellValue(i,ExtraVarCol);
          tmpExtraVarStr := v.ToString;
        end else
        begin
          tmpExtraVarStr := '';
        end;
      //end;
      //if (i < 10) then
      //begin
      //  ShowMessage(IntToStr(i)+'   '+IntToStr(DataCol)+'   '+IntToStr(SigmaCol)+'   '+IntToStr(UnitAgeCol)+'   '+tmpDataStr+'   '+tmpSigmaStr+'   '+tmpUnitAgeStr+'');
      //end;
      dmPDF.cdsRawData.Append;
      dmPDF.cdsRawDatai.AsInteger := i;
      try
        if (tmpDataStr <> '') then
        begin
          try
            //dmPDF.cdsRawDatai.AsInteger := i;
            dmPDF.cdsRawDataData.AsString := tmpDataStr;
          except
          end;
        end else
        begin
          try
            //dmPDF.cdsRawDatai.AsInteger := i;
            dmPDF.cdsRawDataData.AsString := '-999.0';
          except
          end;
        end;
      except
      end;
      try
        if (tmpSigmaStr <> '') then
        begin
          try
            Val(tmpSigmaStr,tmpSigmaValue,iCode);
            if (iCode = 0) then
            begin
              tmpSigmaValue := tmpSigmaValue * dmPDF.SigmaFactor;
              if (tmpSigmaValue < DefaultMinimum) then tmpSigmaValue := DefaultMinimum;
              tmpSigmaStr := FormatFloat('######0.000000',tmpSigmaValue);
            end else
            begin
              tmpSigmaValue := tmpSigmaValue * dmPDF.SigmaFactor;
              if (tmpSigmaValue < DefaultMinimum) then tmpSigmaValue := DefaultMinimum;
              tmpSigmaStr := FormatFloat('######0.000000',tmpSigmaValue);
            end;
            dmPDF.cdsRawDataSigma.AsString := tmpSigmaStr;
            //Val(tmpSigmaStr,DataArray[j,2],iCode);
          except
          end;
        end else
        begin
          tmpSigmaValue := DefaultMinimum * dmPDF.SigmaFactor;
          tmpSigmaStr := FormatFloat('######0.000000',tmpSigmaValue);
        end;
        if (tmpUnitAgeStr <> '') then
        begin
          try
            Val(tmpUnitAgeStr,tmpUnitAgeValue,iCode);
            if (iCode = 0) then
            begin
              //tmpUnitAgeValue := tmpUnitAgeValue;
              tmpUnitAgeStr := FormatFloat('######0.000',tmpUnitAgeValue);
            end else
            begin
              tmpUnitAgeValue := 0.0;
              tmpUnitAgeStr := FormatFloat('######0.000',tmpUnitAgeValue);
            end;
            dmPDF.cdsRawDataUnitAge.AsString := tmpUnitAgeStr;
          except
          end;
        end else
        begin
           tmpUnitAgeValue := -11.0;
           tmpUnitAgeStr := FormatFloat('######0.000',tmpUnitAgeValue);
        end;
        if (tmpExtraVarStr <> '') then
        begin
          try
            Val(tmpExtraVarStr,tmpExtraVarValue,iCode);
            if (iCode = 0) then
            begin
              //tmpUnitAgeValue := tmpUnitAgeValue;
              tmpExtraVarStr := FormatFloat('######0.000',tmpExtraVarValue);
            end else
            begin
              tmpExtraVarValue := 0.0;
              tmpExtraVarStr := FormatFloat('######0.000',tmpExtraVarValue);
            end;
            dmPDF.cdsRawDataExtra.AsString := tmpExtraVarStr;
          except
          end;
        end else
        begin
           tmpExtraVarValue := -999.0;
           tmpExtraVarStr := FormatFloat('######0.000',tmpExtraVarValue);
           dmPDF.cdsRawDataExtra.AsString := tmpExtraVarStr;
        end;
      except
      end;
    end;
  end;
  finally
    Xls.Free;
  end;
  if (dmPDF.cdsRawData.RecordCount > 2) then dmPDF.DataImported := true;
  dmPDF.cdsRawData.First;
  dmPDF.cdsRawData.EnableControls;
  sbSheet.SimpleText := 'Finished importing all data';
  sbSheet.Refresh;
  Application.ProcessMessages;
  //f dmPDF.DataImported then  ShowMessage('Data have been imported');
  DataColStr := UpperCase(eDataColStr.Text);
  SigmaColStr := UpperCase(eSigmaColStr.Text);
  UnitAgeColStr := UpperCase(eUnitAgeColStr.Text);
  ExtraVarColStr := UpperCase(eExtraVarColStr.Text);
  Close;
end;

procedure TfmSheetImport.bbOpenSheetClick(Sender: TObject);
var
  tmpStr    : string;
  i         : integer;
begin
  OpenDialogSprdSheet.InitialDir := DataPath;
  if not OpenDialogSprdSheet.Execute then Exit;
  DataPath := ExtractFilePath(OpenDialogSprdSheet.FileName);
  OpenFile(OpenDialogSprdSheet.FileName);
end;

procedure TfmSheetImport.OpenFile(const FileName: string);
var
  StartOpen: TDateTime;
  EndOpen: TDateTime;
  StartSheetSelect, EndSheetSelect: TDateTime;
  Xls: TExcelFile;
  CellReader: TCellReader;
begin
  pData.Visible := true;
   //Open the Excel file.
  Xls := TXlsFile.Create(false);
  try
    FreeAndNil(CellData);
    CellData := TSparseCellArray.Create;
    StartOpen := Now;

    //By default, FlexCel returns the formula text for the formulas, besides its calculated value.
    //If you are not interested in formula texts, you can gain a little performance by ignoring it.
    //This also works in non virtual mode.
    xls.IgnoreFormulaText := true; //bme - hard code this for this situation since just reading cell values

    xls.VirtualMode := true;
    CellReader := TCellReader.Create(ShowOnly50Rows,CellData,Formatted);
    try
      xls.VirtualCellStartReading := CellReader.OnStartReading;
      xls.VirtualCellRead := CellReader.OnCellRead;

      xls.Open(FileName);
      StartSheetSelect := CellReader.StartSheetSelect;
      EndSheetSelect := CellReader.EndSheetSelect;
    finally
      CellReader.Free;
    end;
    EndOpen := Now;
  finally
    Xls.Free;
  end;

  if CellData <> nil then
  begin
    SheetData.ColCount := CellData.ColCount + 1;
    SheetData.RowCount := CellData.RowCount + 1;
  end
  else
  begin
    SheetData.ColCount := 1;
    SheetData.RowCount := 1;
  end;
  if (SheetData.ColCount > 1) and (SheetData.RowCount > 1) then
  begin
    SheetData.FixedRows := 1;
    SheetData.FixedCols := 1;
  end;
  SheetData.Invalidate;
  pDefinitions.Visible := true;
end;

procedure TfmSheetImport.rgUncertaintiesClick(Sender: TObject);
begin
  case rgUncertainties.ItemIndex of
    0 : dmPDF.SigmaFactor := 1.0;
    1 : dmPDF.SigmaFactor := 1.0/2.0;
    2 : dmPDF.SigmaFactor := 1.0/1.96;
  end;
end;

procedure TfmSheetImport.SheetDataDrawCell(Sender: TObject; ACol, ARow: Integer;
  Rect: TRect; State: TGridDrawState);
begin
  SheetData.Canvas.TextRect(Rect, Rect.Left + 2, Rect.Top + 2, GetCellValue(ACol, ARow));
end;

function TfmSheetImport.GetCellValue(const aCol, aRow: integer): string;
begin
if ACol = 0 then
  begin
    if ARow = 0 then exit('');
    exit (IntToStr(aRow));
  end;
  if ARow = 0 then exit(TCellAddress.EncodeColumn(aCol));
  if CellData = nil then exit('');
  exit(CellData.GetValue(ARow, aCol));
end;


end.

Thanks for the file. It indeed seems to come from here:
http://www.tmssoftware.biz/flexcel/doc/vcl/samples/delphi/api/virtual-mode/index.html

But I am not clear if you are using Virtual mode or not, since the code has 2 different buttons: bbImportClick (which doesn't use virtual mode) and bbOpenSheetClick (which does)

Also I don't see any mention of ActiveSheet in this code, so I assume it is set somewhere else.

My first step would be to see if there isn't anything wrong with the files. So you can try the demos:
Without virtual mode:
http://www.tmssoftware.biz/flexcel/doc/vcl/samples/delphi/api/reading-files/index.html
and with virtual mode:
http://www.tmssoftware.biz/flexcel/doc/vcl/samples/delphi/api/virtual-mode/index.html

Do those demos work with your xls file? If they don't please send me the xls file to adrian@tmossoftware.com and we will see what is happening. But my guess is that they will work: changing the activesheet is very basic functionality and if it was broken for xls we would know by now, I am sure someone else would have already filled a bug :)

If the demos work, the next step would be to know if you want virtual mode or not. Virtual mode discards the values read as it is reading them, so after opening a file with virtual mode the xlsfile used to open it is empty. This is why the virtual mode demo loads the data in a TSparseArray, but for this case ti seems overkill: Just let FlexCel load the data in its own memory by not using Virtual Mode. If you are using virtual mode, probably the problem comes from not refreshing the sparsearray with the data. But well, I am not sure if you are using virtual mode or not :)

Just for the record, you can read more about virtual mode here:
http://www.tmssoftware.biz/flexcel/doc/vcl/guides/performance-guide.html#virtual-mode
But in this case, I don't think it is worth. You need to load the data into memory to draw the grid, so you are not gaining anything by defining your own TSparseArray structure and loading it there.
But well, let's not get ahead of ourselves. Please confirm me if the demos work with your files first, so we can know if it is an issue with the files themselves. And let me know if you are using virtual mode or not.

Thanks.

I have used the demo program to confirm that both with and without virtualmode, both XLS and XLSX versions of my spreadsheet open fine. There is no problem with the spreadsheets themselves.

I use VirtualMode to initially open a spreadsheet and to select which tabsheet I want. In this process I only display the first 50 records. This is done via procedure OpenFile.

Then I open the file again but not in virtualmode in procedure ImportFile. Here, I step through from FromRow to ToRow, reading in values from each of the specified columns (there are 4 columns I am interested in) and store them in a clientdataset for subsequent data processing.

I suspect that my problem relates to ActiveSheet not being set and that XLS files behave differently to XLSX files. I had assumed that, having selected the appropriate tabsheet as part of the OpenFile process, that would be the tabsheet with focus for the non-virtualmode ImportFile procedure. For a XLSX file this is the way it works. It is just for XLS files that it does not work.

The reason I have done things this way with initial use of VirtualMode and then non VirtualMode is that my files can potentially be very large (hundreds of thousands of rows assuming I use XLSX format) and I want to only show a few rows initially while deciding which columns hold the data of interest. Once this is defined, I thought I had to open the entire file to be able to read all the rows I want.

Thanks
   Bruce


I guess another question now, is how does one determine and store which tabsheet one wants to process. I have tried assigning the name and number of the tabsheet to private variables with statements both before and after StartSheetSelect and EndSheetSelect but neither provides anything other than the first tabsheet. In addition, if I do this and then specify the ActiveSheet in the Import part of the process, I stop the XLSX version of things from working as it used to.

Hi,
While internally xls and xlsx files are completely different
animals and that could cause some disparity in the external behavior, I
can't see it here.

It seems like I am missing something. But to
make this simpler to follow, I've made a simple app which does what I
think your app is doing:
http://www.tmssoftware.biz/flexcel/samples/virtmode.zip

It has 3 buttons:
1)the first checkbox allows to switch between xlsx and xls test files.
2)The first button loads all the sheets in virtual mode (50 rows) and loads them into the grid.
3)The second button opens the file again in normal mode, and shows the cell A1 of the selected tabsheet.

I
am not seeing any difference between xls or xlsx, and I can't really
imagine how it could be. The 2 xlsxfile instances are different, so the
activesheet is not shared between them. But if I assign the activesheet
of the xls/x file I opened in virtual mode to the tab sheet index, then
use the tabsheet index to set the activesheet in the second xlsfile
object, it works as expected, in both xls and xlsx.