Feat: `Std.FileIO.ReadFile` and `WriteFile` #6198

MikaelMayer · 2025-04-18T11:27:47Z

This PR adds an opinionated way to write Dafny strings to files and read them back using UTF-8 encoding.
With this PR, it finally becomes possible to create command-line utilities that read a file and process it easily.

What was changed?

Added FileIO.ReadUTF8FromFile and FileIO.WriteUTF8ToFile so that it's easier to read/write strings to text files within Dafny.
FromUTF8Checked, FromUTF16Checked and DecodeCodeUnitSequenceChecked all now return a Result instead of an Option so that the error message is clearer.

Example

For example, here is a tool that displays the content of a file:

import opened Std.FileIO
method Main(args: seq<string>) {
  expect |args| >= 2;
  var result :- expect ReadUTF8FromFile(args[1]);
  print result; // `:- expect` already extracts the string from the Result
}

Usage:

dafny run --standard-libraries cat.dfy cat.dfy

It prints itself.

How has this been tested?

I added tests that read and write strings, similar to the existing tests that read and write UTF-8 characters.

By submitting this pull request, I confirm that my contribution is made under the terms of the MIT license.

I added my own versions of FileIO.ReadFile and FileIO.WriteFile that I needed for my own application.

Fixes #6196. I updated the tests to confirm that the test was failing and that this PR fixes it. Basically, a check was testing whether the type was a bitvector type, but with the new resolver this type was a synonym type, so the check needed to be updated.

robin-aws

Definitely in favor of the feature!

robin-aws · 2025-04-28T16:12:59Z

Source/DafnyStandardLibraries/src/Std/TargetSpecific/FileIO5-notarget-java-cs-js-go-py.dfy

+    if !Utf8EncodingForm.IsWellFormedCodeUnitSequence(bytes) {
+      return Failure("Byte sequence of file '" + fileName + "' is not well formed UTF8");
+    }
+    var x: seq<bv24> := Utf8EncodingForm.DecodeCodeUnitSequence(bytes);


I think you can avoid all the extra error checking by using UnicodeStringsWithUnicodeChar.FromUTF8Checked instead.

That method only returns an Option. This error checking makes it possible to report precisely at which index the error occurs. I think I prefer that. Later, we could still refactor this error checking code if it's used somewhere else.

Okay fair. It would be great to improve Unicode with more precise error messages as well at some point.

It does feel like you're reimplementing a fair bit of concepts from the Unicode module, and we have to be very careful to be consistent and sound.

For soundness, the good news is that all my code is only establishing the proof that I can call into the standard libraries, so we should be good for that.

robin-aws · 2025-04-29T17:28:08Z

Source/DafnyStandardLibraries/src/Std/TargetSpecific/FileIO5-notarget-java-cs-js-go-py.dfy

+    var x: seq<bv24> := Utf8EncodingForm.DecodeCodeUnitSequence(bytes);
+    for i := 0 to |x|
+      invariant forall k | 0 <= k < i :: x[k] < 1 << 16 {
+      if !(x[i] < 1 << 16) {


This is wrong: in --unicode-char:true mode (the only mode the libraries currently support), valid characters go up to 0x10FFFF, so you may reject valid data here.

As much as I'd love to do that, when I run make update-standard-libraries with the change you suggested,

      invariant forall k | 0 <= k < i :: x[k] < 0x10FFFF {
      if !(x[i] < 0x10FFFF) {
        return Failure("index " + Strings.OfInt(i) + " is not a valid char");
      }

I get:

FileIO5-notarget-java-cs-js-go-py.dfy(93,54): Error: bit-vector value to be converted might not fit in char
   |
93 |     var s := seq(|x|, i requires 0 <= i < |x| => x[i] as char);
   |                                                       ^^

which is consistent with the verification condition we emit:

var toWidth = 16;
if (toWidth < fromWidth) {
  // Check "expr < (1 << toWidth)" in type "fromType" (note that "1 << toWidth" is indeed a value in "fromType")
  PutSourceIntoLocal();
  var toBound = BaseTypes.BigNum.FromBigInt(BigInteger.One << toWidth); // 1 << toWidth
  var bound = BplBvLiteralExpr(tok, toBound, fromType.AsBitVectorType);
  var boundsCheck = FunctionCall(expr.Origin, "lt_bv" + fromWidth, Bpl.Type.Bool, o, bound);
  var dafnyBound = new BinaryExpr(expr.Origin, BinaryExpr.Opcode.LeftShift, Expression.CreateIntLiteral(expr.Origin, 1), Expression.CreateIntLiteral(expr.Origin, toWidth));
  var dafnyBoundsCheck = new BinaryExpr(expr.Origin, BinaryExpr.Opcode.Lt, expr, dafnyBound);
  builder.Add(Assert(tok, boundsCheck, new ConversionFit("bit-vector value", toType, dafnyBoundsCheck, errorMsgPrefix), builder.Context));
}

Ah that's because the correct check is to make sure the character isn't a surrogate as well. See here:

dafny/Source/DafnyCore/Verifier/ProofObligationDescription.cs

Line 1863 in 43f7d6c

public static Expression MakeCharBoundsCheckUnicode(Expression expr) {

I think you could simplify and just test if x[i] is char?
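
For readers following this thread, here is a minimal sketch of the range check under discussion. It is not code from this PR: `IsScalarValue` is a hypothetical helper name, and the sketch only spells out the standard definition of a Unicode scalar value (any code point up to 0x10FFFF that is not a surrogate in 0xD800..0xDFFF) for a `bv24` value.

```dafny
// Hypothetical helper, not part of the Dafny standard libraries.
// A code point is a Unicode scalar value when it is at most 0x10FFFF
// and does not fall in the surrogate range 0xD800..0xDFFF.
predicate IsScalarValue(x: bv24) {
  x < 0xD800 || (0xE000 <= x && x <= 0x10FFFF)
}

method Main() {
  expect IsScalarValue(0x41);       // 'A'
  expect IsScalarValue(0x1F600);    // above 0xFFFF, still valid
  expect !IsScalarValue(0xD800);    // first surrogate
  expect !IsScalarValue(0xDFFF);    // last surrogate
  expect IsScalarValue(0x10FFFF);   // largest scalar value
  expect !IsScalarValue(0x110000);  // beyond the Unicode range
  print "ok\n";
}
```

A single upper bound is not enough on its own, which is why both `x < 1 << 16` (rejects valid data) and `x < 0x10FFFF` (accepts surrogates) fall short in the exchange above.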

robin-aws

Thanks so much for pushing the error improvements into Unicode, love it! Just some cleanup before merging.

robin-aws · 2025-04-30T18:39:36Z

Source/DafnyStandardLibraries/src/Std/TargetSpecific/FileIO5-notarget-java-cs-js-go-py.dfy

+    */
+  method ReadUTF8FromFile(fileName: string) returns (r: Result<string, string>) {
+    var bytes :- ReadBytesFromFile(fileName);
+    return UnicodeStringsWithUnicodeChar.FromUTF8Checked(seq(|bytes|, i requires 0 <= i < |bytes| => bytes[i] as uint8));


SO much better, thank you! The rest of the declarations below aside from WriteUTF8ToFile are dead code and can be deleted now, right?

robin-aws · 2025-04-30T18:40:46Z

docs/dev/news/fileioreadwrite.feat

I'd mention the improvements to error messages in Unicode as well, partly because it's a great improvement but also because it's technically a breaking change. Perhaps also mention how to unbreak affected code with Result.ToOption().
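
A minimal sketch of that migration, assuming only `Std.Wrappers`: `ParseDigit` is a hypothetical stand-in for any function that used to return an `Option` and now returns a `Result`. It is not one of the Unicode functions changed in this PR, whose exact error types are not shown in this conversation.

```dafny
import opened Std.Wrappers

// Hypothetical stand-in for a library function whose return type
// changed from Option<int> to Result<int, string>.
function ParseDigit(c: char): Result<int, string> {
  if '0' <= c <= '9' then Success(c as int - '0' as int)
  else Failure("not a digit")
}

method Main() {
  // Callers that still expect an Option can drop the error message.
  var o: Option<int> := ParseDigit('7').ToOption();
  expect o == Some(7);
  expect ParseDigit('x').ToOption() == None;
  print "ok\n";
}
```

This should run with `dafny run --standard-libraries`; call sites that want the new, more precise error can instead match on `Failure(error)` directly.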

robin-aws · 2025-04-30T18:42:23Z

Source/DafnyStandardLibraries/examples/FileIO/ReadFromFile.dfy

+      expect res.value == expectedStr;
+    }
+
+      // Failure path: attempting to read from a blank file path should never work.


A second failure case where the bytes are not UTF8 would be very valuable too, especially since AFAICT we don't prove that the index is correct. Could also put it in UnicodeExamples.dfy

UnicodeExamples already has a test for ill cases, so I just added 2 expect lines to demonstrate what to expect as an error message.

robin-aws · 2025-04-30T18:47:12Z

Source/DafnyStandardLibraries/src/Std/Unicode/UnicodeEncodingForm.dfy

@@ -127,43 +128,45 @@ abstract module Std.Unicode.UnicodeEncodingForm {

  /**
    * Returns the unique partition of the given code unit sequence into minimal well-formed code unit subsequences,
-    * or None if no such partition exists.
+    * or Failure(CodeUnitSeq) if no such partition exists.


Can you be more specific and say it returns the suffix that couldn't be partitioned? I had to read a fair bit of other code to figure that out.

robin-aws · 2025-04-30T18:48:13Z

docs/dev/news/fileioreadwrite.feat

+With `--standard-libraries` you can now read an UTF-8 text files from the disk using `Std.FileIO.ReadFile(path: string): Result<string, string>`.
+To write some content to the disk, use `Std.FileIO.WriteFile(path: string, content: string): Outcome<string>`.


Suggested change

-With `--standard-libraries` you can now read an UTF-8 text files from the disk using `Std.FileIO.ReadFile(path: string): Result<string, string>`.
-To write some content to the disk, use `Std.FileIO.WriteFile(path: string, content: string): Outcome<string>`.
+With `--standard-libraries` you can now read an UTF-8 text files from the disk using `Std.FileIO.ReadUTF8FromFile(path: string): Result<string, string>`.
+To write some content to the disk, use `Std.FileIO.WriteUTF8ToFile(path: string, content: string): Outcome<string>`.

MikaelMayer and others added 12 commits April 16, 2025 15:39

Feat: FileIO.ReadFile and FileIO.WriteFile

4a5ca47

Updated the standard libraries

c9c82d7

Update dfyconfig.toml

f11ee5c

Merge branch 'master' into feat-fileio-read-write

0ab2e7c

Added missing legacy case

5ededcb

Merge branch 'master' into feat-fileio-read-write

aafac08

Updated doo files

24bfcde

Regenerated files

8f9a0b9

Updated resolution errors

34eb454

Merge branch 'master' into feat-fileio-read-write

70476b3

Fixed test

68460e4

robin-aws requested changes Apr 28, 2025

MikaelMayer and others added 3 commits April 28, 2025 15:48

Merge branch 'master' into feat-fileio-read-write

d9c73d9

Updated std libs

5981986

Update WriteToFile.dfy

c4c693d

robin-aws requested changes Apr 29, 2025

MikaelMayer added 2 commits April 30, 2025 12:30

Result instead of option

df4f730

Merge branch 'feat-fileio-read-write' of https://github.com/dafny-lang/dafny into feat-fileio-read-write

3bdf5cb

robin-aws requested changes Apr 30, 2025

MikaelMayer added 3 commits April 30, 2025 23:29

Review comments

ac4caad

Merge branch 'master' into feat-fileio-read-write

0560297

Fixed ReadUTF8FromFile.dfy

ccd950e

robin-aws previously approved these changes May 1, 2025

MikaelMayer added 3 commits May 1, 2025 19:41

Formatting

1a7b855

Merge branch 'master' into feat-fileio-read-write

a85b03d

Formatting needs regeneration

9a59a11

MikaelMayer dismissed robin-aws’s stale review via 9a59a11 May 2, 2025 00:51

Merge branch 'master' into feat-fileio-read-write

a5deab7

MikaelMayer enabled auto-merge (squash) May 2, 2025 12:53

robin-aws approved these changes May 2, 2025

MikaelMayer merged commit 5c84042 into master May 2, 2025
22 checks passed

MikaelMayer deleted the feat-fileio-read-write branch May 2, 2025 15:09

botantony mentioned this pull request Aug 26, 2025

dafny 4.11.0 Homebrew/homebrew-core#234941

Merged
