3148-backport-2395-to-Pharo-70--Non-ASCII-class-and-author-names-break-SourceFileArraygetPreambleFromat- #3149

Merged
28 changes: 28 additions & 0 deletions src/System-Sources/SourceFile.class.st
@@ -61,6 +61,34 @@ SourceFile >> fullName [
^ path asString
]

+{ #category : #accessing }
+SourceFile >> getPreambleAt: startingPosition [
+"Search backwards from byte startingPosition in my stream for a method preamble and return it.
+A method preamble looks like: MyClass methodsFor: 'test' stamp: 'author 1/27/2019 12:27'
+but with exclamation marks ($!) around it (the contents excluding the $!'s is returned).
+startingPosition should be set one position before the closing $!"
+
+| characterReadStream binaryStream encoder position |
+"I hold either a ZnCharacterReadStream or a ZnCharacterReadWriteStream (see #tryOpenReadOnly:)
+Use #isReadOnly and #readOnlyCopy to access the ZnCharacterReadStream in both cases"
+characterReadStream := self isReadOnly ifTrue: [ stream ] ifFalse: [ stream readOnlyCopy ].
+"Access the binary read stream wrapped by the character read stream"
+binaryStream := characterReadStream wrappedStream.
+"Access the encoder held by the character read stream"
+encoder := characterReadStream encoder.
+"Search backwards for the previous occurrence of $!
+Although the underlying encoding is UTF-8 we can still operate/move at the byte level
+since $! code 33 cannot occur in code points encoded using 2, 3 or 4 bytes"
+position := startingPosition.
+[ position >= 0
+and: [
+binaryStream position: position.
+binaryStream next ~= 33 "$!" ] ]
+whileTrue: [ position := position - 1 ].
+"Now that we found the byte range, extract and decode it"
+^ encoder decodeBytes: (binaryStream next: startingPosition - position)
+]
+
{ #category : #testing }
SourceFile >> isOpen [

12 changes: 1 addition & 11 deletions src/System-Sources/SourceFileArray.class.st
@@ -329,17 +329,7 @@ SourceFileArray >> forceChangesToDisk [

{ #category : #'public - string reading' }
SourceFileArray >> getPreambleFrom: aFileStream at: position [
-"To read preamble of method we need read back characters until $!.
-But given aFileStream can have UTF8 encoding which make it a bit tricky to scan stream in reverse order.
-First we need to move back in stream using byte reading without encoding. It performed by basicNext.
-Next we still could not read exact number of characters because (position-startIndex) is not characters number in case of UTF8.
-So at the end we use another loop to read encoded characters step by step until stream will be at original position"
-
-| startIndex |
-startIndex := position.
-[ startIndex >= 0 and: [ aFileStream position: startIndex. aFileStream next ~~ $! ] ] whileTrue: [ startIndex := startIndex - 1 ].
-
-^ String streamContents: [ :result | result nextPutAll: (aFileStream next: position - aFileStream position + 1) ]
+^ aFileStream getPreambleAt: position
]

{ #category : #initialization }
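
The byte-level backward search used by `getPreambleAt:` can be tried in isolation in a Playground. The following is only a minimal sketch under stated assumptions: the sample preamble, the class name `Élève`, and the author name `jürgen` are hypothetical test data, and an in-memory `ByteArray` read stream stands in for the binary stream that the real method obtains through `wrappedStream`. It relies on the property noted in the method comment: byte 33 (`$!`) never appears inside a multi-byte UTF-8 sequence, so scanning raw bytes cannot land in the middle of a character.

```smalltalk
"Playground sketch; sample data is hypothetical and not part of the PR."
| encoder bytes binaryStream startingPosition position |
encoder := ZnUTF8Encoder new.
"A chunk-format preamble with non-ASCII class and author names, wrapped in $!"
bytes := encoder encodeString: '!Élève methodsFor: ''test'' stamp: ''jürgen 1/27/2019 12:27''!'.
"Stand-in for the binary stream wrapped by the character read stream"
binaryStream := bytes readStream.
"Stream positions are 0-based: start one position before the closing $!"
startingPosition := bytes size - 2.
position := startingPosition.
"Walk backwards one byte at a time until byte 33 ($!) is read"
[ position >= 0
	and: [
		binaryStream position: position.
		binaryStream next ~= 33 ] ]
	whileTrue: [ position := position - 1 ].
"The stream now sits just after the opening $!; decode the bytes up to startingPosition"
encoder decodeBytes: (binaryStream next: startingPosition - position)
```

Inspecting the last expression should show the preamble without the surrounding exclamation marks and with the accented characters intact, because the byte range is decoded only after its boundaries have been found.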