corrupt data from FindNextFileW

krazymike

This code is in an Access database module (VBA). It builds an index
of all files and directories in a given tree.

Sorry, some elements of the paths have been changed in this post due
to some of my firm's policies.

One such file has the path: "\\server\data\shared\username\New
Folderaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaadfasdfasdfasdifuasdiofuy
asdkofuh asdkjfjh askldjfh askdfh askdjfh askldjfh askdjfhasd
\adsfdkjfh askdjfh asldkjfh asdklf hasdfkljashdlfjksdfh.txt" (yes, I
made that path intentionally to test long file paths). That one's 309
characters.

Every file indexed after that one seems to inherit stray characters
from it. For example, "\\server\data\shared\username
\CLX4IndyGoHelp.chmaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaadfasdfasdfasdifuasdiofuy
asdkofuh asdkjfjh askldjfh askdfh askdjfh askldjfh askdjfhasd" should
be "\\server\data\shared\username\CLX4IndyGoHelp.chm".

I can't see that I'm doing anything wrong. I'm guessing this is due
to some object or element not getting reinitialized before being
reused, but which? Any thoughts would be appreciated.

Portions of this code were inspired by a post from Karl E. Peterson.

Here's the code:

Option Compare Database
Option Explicit

Private Declare Function FindFirstFile Lib "kernel32" Alias "FindFirstFileW" _
(ByVal lpFileName As Long, lpFindFileData As WIN32_FIND_DATA) As Long
Private Declare Function FindNextFile Lib "kernel32" Alias "FindNextFileW" _
(ByVal hFindFile As Long, lpFindFileData As WIN32_FIND_DATA) As Long
Private Declare Function FindClose Lib "kernel32" (ByVal hFindFile As Long) As Long

Private Const INVALID_FILE_ATTRIBUTES As Long = -1&
Private Const INVALID_HANDLE_VALUE As Long = -1&

Private Type FILETIME
dwLowDateTime As Long
dwHighDateTime As Long
End Type

Private Type WIN32_FIND_DATA
dwFileAttributes As Long
ftCreationTime As FILETIME
ftLastAccessTime As FILETIME
ftLastWriteTime As FILETIME
nFileSizeHigh As Long
nFileSizeLow As Long
dwReserved0 As Long
dwReserved1 As Long
cFileName(0 To 519) As Byte
cAlternate(0 To 27) As Byte
End Type
Dim rsF As Recordset, rsD As Recordset

Public Sub Main()
Dim src As String
Set rsD = CurrentDb.OpenRecordset("dir_list")
Set rsF = CurrentDb.OpenRecordset("file_list")
Do While src = ""
src = Prod_Path
Loop
Call getEm(src)
rsF.Close
rsD.Close
End Sub

Sub getEm(ByVal path As String)
Dim hFind As Long, src As String
Dim nFound As Long, temP As String
Dim wfd As WIN32_FIND_DATA

src = ""
src = path

If Right(path, 1) = "\" Then path = Left(path, Len(path) - 1)
If Left(path, 2) = "\\" Then
' Drop only the two leading backslashes: \\server\share -> \\?\UNC\server\share
path = "\\?\UNC\" & Right(path, Len(path) - 2) & "\*.*"
Else
path = "\\?\" & path & "\*.*"
End If

hFind = FindFirstFile(StrPtr(path), wfd)
If hFind <> INVALID_HANDLE_VALUE Then
Do
temP = Trim(Replace(wfd.cFileName, Chr(0), ""))
If temP <> "." And temP <> ".." Then
' Attributes are bit flags, so mask out the directory bit (16) before comparing
Select Case (wfd.dwFileAttributes And 16)
Case 16
With rsD
.AddNew
!Name = src & "\" & temP
.Update
End With
getEm(src & "\" & temP)
Case Else
With rsF
.AddNew
!Name = src & "\" & temP
!ShortPath = src & "\" & Trim(Replace(wfd.cAlternate, Chr(0), ""))
.Update
End With
End Select
End If
Loop Until FindNextFile(hFind, wfd) = 0
End If
Call FindClose(hFind) ' Clean up.
End Sub

Function Prod_Path() As String
On Error GoTo Err_pathdialog

Dim fd As FileDialog
Dim temP As String ' was undeclared, which fails under Option Explicit
Set fd = Application.FileDialog(msoFileDialogFolderPicker)
Dim vrtSelectedItem As Variant

With fd
If .Show = -1 Then
For Each vrtSelectedItem In .SelectedItems
' MsgBox "The file folder is: " & vrtSelectedItem
temP = vrtSelectedItem
Next vrtSelectedItem
End If
End With
Do While temP = ""
temP = Prod_Path() 'recurse until a directory is chosen
Loop
Prod_Path = temP
Exit_pathdialog:
Exit Function

Err_pathdialog:
MsgBox Err.Description
Resume Exit_pathdialog

End Function
 
Bob Butler

temP = Trim(Replace(wfd.cFileName, Chr(0), ""))

That line is incorrect; the buffer can contain garbage after the initial
Null character so you are keeping the garbage. Something like this would be
closer:
temP = Left$(wfd.cFileName, Instr(1,wfd.cFileName,vbNullChar)-1)
 
Karl E. Peterson

krazymike said:
temP = Trim(Replace(wfd.cFileName, Chr(0), ""))

Try changing that to:

temP = TrimNull(wfd.cFileName)

Where:

Private Function TrimNull(ByVal Data As String) As String
Dim nNull As Long
nNull = InStr(Data, vbNullChar)
Select Case nNull
Case 0 ' Just do normal trim
TrimNull = Trim$(Data)
Case 1 ' Empty string
TrimNull = ""
Case Else
TrimNull = Left$(Data, nNull - 1)
End Select
End Function
 
Sinna

Karl said:
Try changing that to:

temP = TrimNull(wfd.cFileName)

Where:

Private Function TrimNull(ByVal Data As String) As String
Dim nNull As Long
nNull = InStr(Data, vbNullChar)
Select Case nNull
Case 0 ' Just do normal trim
TrimNull = Trim$(Data)
Case 1 ' Empty string
TrimNull = ""
Case Else
TrimNull = Left$(Data, nNull - 1)
End Select
End Function

Karl,

I don't see why you're splitting it up into three parts as Left$(foo, 0)
doesn't raise an error.

So I get:
<code>
lNullPos = InStr(1, Data, vbNullChar)
If lNullPos Then Data = Left(Data, lNullPos - 1)
</code>
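
Wrapped up as a complete function, that shorter version might look something like the sketch below. The name TrimNullShort is made up for illustration and does not come from anyone's posted code. One behavioural difference to be aware of: when no null is found, Karl's TrimNull applies Trim$, while this version returns the input unchanged.

```vba
Option Explicit

' Hypothetical name; a shortened take on the TrimNull idea.
Public Function TrimNullShort(ByVal Data As String) As String
    Dim lNullPos As Long
    lNullPos = InStr(1, Data, vbNullChar)
    If lNullPos Then
        ' Left$(Data, 0) is simply "", so a leading null needs no special case.
        TrimNullShort = Left$(Data, lNullPos - 1)
    Else
        ' No terminator at all: hand the input back untouched.
        TrimNullShort = Data
    End If
End Function
```

Called as temP = TrimNullShort(wfd.cFileName), it relies on VB copying the byte array into the String parameter as-is.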

Sinna
 
Ben Jones

The output from FindNextFileW is a Unicode string. When interpreted
by VB (ANSI), the two-byte-per-char Unicode is converted to one-byte-
per-char ANSI, resulting in alternating Null bytes throughout the VB
string. My replace is to remove all of them. Otherwise, that
TrimNull (which I was using) was only returning the first character
since the second one was a null.

I fixed my problem, though. Hit me as I was going to sleep. I
would've anticipated this in C++, but not VB. The .cFileName is
defined as a FIXED array of BYTES. So when the field is populated
with such a long value, and gets re-written with a smaller value only
the BYTES needed for the smaller value are replaced. When read, the
trailing bytes are still there, and thus returned.

Here's my fix:

For i = 0 To UBound(wfd.cFileName)
wfd.cFileName(i) = CByte(0)
Next
For i = 0 To UBound(wfd.cAlternate)
wfd.cAlternate(i) = CByte(0)
Next
Loop Until FindNextFile(hFind, wfd) = 0

Thus explicitly resetting these bytes back to the null state.

Thanks for your input.

krazymike

blnBozo_Bit = false
 
Bob Butler

Ben Jones said:
The output from FindNextFileW is a Unicode string. When interpreted
by VB (ANSI), the two-byte-per-char Unicode is converted to one-byte-
per-char ANSI, resulting in alternating Null bytes throughout the VB
string. My replace is to remove all of them.

The correct fix is to assign the byte array to a string, which is a straight
Unicode-to-Unicode copy. Simply stripping null bytes can corrupt the data.

Ben Jones said:
Otherwise, that
TrimNull (which I was using) was only returning the first character
since the second one was a null.

No, the second byte was a null, not the second character.

Ben Jones said:
I fixed my problem, though. Hit me as I was going to sleep. I
would've anticipated this in C++, but not VB. The .cFileName is
defined as a FIXED array of BYTES. So when the field is populated
with such a long value, and gets re-written with a smaller value only
the BYTES needed for the smaller value are replaced. When read, the
trailing bytes are still there, and thus returned.

Here's my fix:

For i = 0 To UBound(wfd.cFileName)
wfd.cFileName(i) = CByte(0)
Next
For i = 0 To UBound(wfd.cAlternate)
wfd.cAlternate(i) = CByte(0)
Next
Loop Until FindNextFile(hFind, wfd) = 0

No need to waste time filling the arrays on every call. The result is
terminated by a null character so once you assign the byte array to a string
you just have to trim it at the first vbNullChar (Chr$(0)).
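
To see the effect without touching the API at all, here is a small sketch that imitates a reused buffer in plain VBA. FillBuffer, ReusedBuffer and Demo are hypothetical names invented for this illustration, and the two file names are made up; the only assumption about the API is the one stated in this thread, namely that it writes the new name plus a terminating null and leaves the rest of the buffer alone.

```vba
Option Explicit

' Hypothetical helper: overwrite the start of a fixed buffer with a
' Unicode name plus a terminating null, leaving later bytes untouched.
Private Sub FillBuffer(ByRef buf() As Byte, ByVal s As String)
    Dim src() As Byte, i As Long
    src = s & vbNullChar            ' String -> Byte array keeps the 2-byte chars
    For i = 0 To UBound(src)
        buf(i) = src(i)
    Next
End Sub

' Builds the buffer state after a long name is followed by a short one.
Public Function ReusedBuffer() As String
    Dim buf(0 To 519) As Byte
    FillBuffer buf, "LongFileName.txt"   ' first, longer result
    FillBuffer buf, "a.chm"              ' second, shorter result
    ReusedBuffer = buf                   ' Byte array -> String, copied as-is
End Function

Public Sub Demo()
    Dim s As String
    s = ReusedBuffer()
    ' Stripping every null glues the stale tail onto the new name.
    Debug.Print Replace(s, vbNullChar, "")           ' a.chmleName.txt
    ' Cutting at the first null keeps only the current name.
    Debug.Print Left$(s, InStr(s, vbNullChar) - 1)   ' a.chm
End Sub
```

The stale tail is the same kind of leftover seen in the original post, where the long test file name bled into every name found after it.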
 
Karl E. Peterson

Sinna said:
I don't see why you're splitting it up into three parts as Left$(foo, 0)
doesn't raise an error.

Good point. I wanted to say that, at some point, there was a version of MSBASIC
that didn't like that. But I just tried it in some earlier versions, and don't see
that happening at all. Maybe I was trying to "future-proof" it, back when there was
still a future for the language? I'll probably update to reflect that.

Thanks... Karl
 
Karl E. Peterson

Ben said:
The output from FindNextFileW is a Unicode string.

In this case, it's actually a Unicode string stored in a Byte array.

Ben said:
When interpreted
by VB (ANSI), the two-byte-per-char Unicode is converted to one-byte-
per-char ANSI,

No, it's not. When you assign a Byte array that contains a Unicode string to a
standard String variable, no conversion of any sort is needed or takes place.

Ben said:
resulting in alternating Null bytes throughout the VB
string. My replace is to remove all of them.

I'm not exactly sure why it always comes down to an argument with you?

You are incorrect.

Ben said:
Otherwise, that
TrimNull (which I was using) was only returning the first character
since the second one was a null.

Hardly.

Ben said:
I fixed my problem, though. Hit me as I was going to sleep. I
would've anticipated this in C++, but not VB.

Hmmmmm, perhaps that's the problem? This isn't your native language.

Ben said:
The .cFileName is
defined as a FIXED array of BYTES. So when the field is populated
with such a long value, and gets re-written with a smaller value only
the BYTES needed for the smaller value are replaced.

Well, duh. You haven't changed the size or the contents of the buffer at all, have
you?

Ben said:
When read, the
trailing bytes are still there, and thus returned.

Of course they are.

Ben said:
Here's my fix:

For i = 0 To UBound(wfd.cFileName)
wfd.cFileName(i) = CByte(0)
Next
For i = 0 To UBound(wfd.cAlternate)
wfd.cAlternate(i) = CByte(0)
Next
Loop Until FindNextFile(hFind, wfd) = 0

Thus explicitly resetting these bytes back to the null state.

Totally unnecessary.

Fwiw, your earlier thread on this topic got me curious enough to write a functional
equivalent of VB's Dir() using the Find*W functions. There's *absolutely* no need
to be resetting the buffer between calls. It's a total waste of cycles.

Ben said:
Thanks for your input.

A pleasure, normally. Just wish I felt like you were open to it.
 
Sinna

Ben said:
The output from FindNextFileW is a Unicode string. When interpreted
by VB (ANSI), the two-byte-per-char Unicode is converted to one-byte-
per-char ANSI, resulting in alternating Null bytes throughout the VB
string. My replace is to remove all of them. Otherwise, that
TrimNull (which I was using) was only returning the first character
since the second one was a null.

I fixed my problem, though. Hit me as I was going to sleep. I
would've anticipated this in C++, but not VB. The .cFileName is
defined as a FIXED array of BYTES. So when the field is populated
with such a long value, and gets re-written with a smaller value only
the BYTES needed for the smaller value are replaced. When read, the
trailing bytes are still there, and thus returned.

Here's my fix:

For i = 0 To UBound(wfd.cFileName)
wfd.cFileName(i) = CByte(0)
Next
For i = 0 To UBound(wfd.cAlternate)
wfd.cAlternate(i) = CByte(0)
Next
Loop Until FindNextFile(hFind, wfd) = 0

Thus explicitly resetting these bytes back to the null state.

Thanks for your input.

krazymike

Ben,

There's really no need to clear the buffers as you do.
Why not use this instead:

<code>
Dim sFileName As String, lNullPos As Long
sFileName = wfd.cFileName '***
lNullPos = InStr(1, sFileName, vbNullChar)
If lNullPos Then sFileName = Left(sFileName, lNullPos - 1)
</code>

The line marked *** copies the Unicode string stored in the byte array
into a String variable; no ANSI conversion is involved.
It doesn't matter that data from a previous call is left over in the
buffer, because you keep only the part before the first vbNullChar.


Sinna
 
Ben Jones

Well, I do apologize. Sincerely. I've never been good when it comes
to tone. In person or online. Working with the OS at such a low
level is new to me. I've always worked at a more GUI/ WinForm level.
Dealing with these sorts of conversion was never necessary in any of
the apps I've worked on before.

I had tried that TrimNull and just had one character return.
Apparently, some other part of my code was bad at that time, because I
uncommented the code and re-ran it, and it works now. So my only
conclusion is that it was my own error somewhere that caused that issue.

Can you explain this? My code, thanks to you, is running properly and
quickly. I don't know how to resize the buffer, or why it's
necessary.
 
Karl E. Peterson

Ben said:
Well, I do apologize. Sincerely. I've never been good when it comes
to tone. In person or online.

Accepted and understood. ASCII can be that way. :)

Ben said:
Working with the OS at such a low
level is new to me. I've always worked at a more GUI/ WinForm level.
Dealing with these sorts of conversion was never necessary in any of
the apps I've worked on before.

This UniMess thing threw a lot of us when we first encountered it! Understanding
the StrConv function is essential -- both when to use it, and when not to use it.

Ben said:
Can you explain this? My code, thanks to you, is running properly and
quickly. I don't know how to resize the buffer, or why it's
necessary.

It's not at all necessary. That's the point. The API keeps using the same buffer
over and over. You just trim it at the null terminator. (My turn to apologize
about the "duh" thing. <g>)
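
On the StrConv point, a rough illustration of "when to use it, and when not to" might look like this. DemoStrConv and SameResult are hypothetical names, and the sample text "abc" is made up; the sketch only contrasts bytes as a W function would deliver them (two bytes per character) with bytes as an A function would (one byte per character).

```vba
Option Explicit

' Returns True when both routes produce the same String.
Public Function SameResult() As Boolean
    Dim wide() As Byte, ansi() As Byte
    Dim s1 As String, s2 As String

    wide = "abc"                           ' 6 bytes, like data from a ...W call
    ansi = StrConv("abc", vbFromUnicode)   ' 3 bytes, like data from a ...A call

    s1 = wide                              ' already two bytes per char: assign directly
    s2 = StrConv(ansi, vbUnicode)          ' one byte per char: widen with StrConv first

    SameResult = (s1 = s2) And (Len(s1) = 3)
End Function

Public Sub DemoStrConv()
    Debug.Print SameResult()
End Sub
```

Running StrConv(..., vbUnicode) on the wide bytes instead would double every byte again, which is one way to end up with a string full of embedded nulls.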
 
Carey Gregory

krazymike said:
yes i made that path intentionally to test long filepaths. That one's 309
characters.

That's a good trick considering that the maximum path length on Windows is
256.
 
Ben Jones

Carey Gregory said:
That's a good trick considering that the maximum path length on Windows is
256.

Try it. Go to your command prompt. (replace q: with an available
drive letter)

Type: subst q: "c:\documents and settings\all users\desktop"

Then: dir > q:\testingifyoursystemcanreallyhavemorethantwohundredfiftyfivecharactersinapathliketheguyongooglegroupssaidhedidbutireallydontbelievehimbecuasemicrosoftsaidsomethingelseandtheyneverlieheresabunchmorecharacterstofillupthemaxpath..txt

Nice ugly filename. Unmount the q: drive with subst q: /D

Now try to double-click the text document on your desktop.
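
For context, the classic MAX_PATH limit is 260 characters, and the wide (W) file functions accept longer paths when the path carries the \\?\ prefix, which is what getEm in the first post builds inline. A minimal sketch of that prefixing as a separate function follows; ToExtendedPath is a hypothetical name, not something from the posted code.

```vba
Option Explicit

' Hypothetical helper: add the extended-length prefix to a full path.
Public Function ToExtendedPath(ByVal path As String) As String
    If Left$(path, 4) = "\\?\" Then
        ' Already prefixed: leave it alone.
        ToExtendedPath = path
    ElseIf Left$(path, 2) = "\\" Then
        ' UNC path: \\server\share -> \\?\UNC\server\share
        ToExtendedPath = "\\?\UNC\" & Mid$(path, 3)
    Else
        ' Drive path: C:\dir -> \\?\C:\dir
        ToExtendedPath = "\\?\" & path
    End If
End Function
```

The result is meant to be handed to the W functions through StrPtr, as the original FindFirstFile call does; it only makes sense for fully qualified paths.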
 
