Unsoundness with `typing.IO` and friends #1994

JelleZijlstra · 2025-05-06T14:47:58Z

The classes typing.IO, typing.BinaryIO, and typing.TextIO look like they want to be Protocols or ABCs, but in fact they are defined as regular generic classes, both at runtime and in typeshed. However, they are meant to encompass the concrete IO classes defined in the io module, and in typeshed we implement that by having these classes inherit from typing.*IO classes, even though there is no such inheritance at runtime.

This can easily lead to unsound behavior:

```python
import io, typing

def f(x: int | io.BytesIO) -> int:
    if isinstance(x, typing.BinaryIO):
        return x.fileno()
    return x

f(io.BytesIO()) + 1  # boom
```

Type checkers think BytesIO is a subclass of BinaryIO, because that's how it's defined in typeshed, but in fact it isn't at runtime.
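The mismatch can be checked directly at runtime. The following is a minimal sketch, not part of the original report; the variable names are illustrative only.

```python
import io
import typing

buf = io.BytesIO()

# At runtime, BytesIO is tied to the io ABCs, not to typing.BinaryIO.
is_binary_io = isinstance(buf, typing.BinaryIO)
is_buffered_base = isinstance(buf, io.BufferedIOBase)
in_mro = typing.BinaryIO in type(buf).__mro__

print(is_binary_io, is_buffered_base, in_mro)  # False True False
```

Because the `isinstance()` check is false at runtime, `f()` above falls through to `return x` and hands back the `BytesIO` object, even though a type checker has narrowed `x` to `int` on that branch.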

I can see a few solutions:

1. Special-case typing.*IO in the spec and say that type checkers should reject isinstance()/issubclass() calls involving them.
2. Deprecate the typing.*IO classes and eventually remove them, nudging people to use their own Protocols (or the new io.Reader/io.Writer) instead. This is conceptually clean but may be annoying for a lot of users; the typing classes are nice to use in simple application code.
3. Make these classes actually (runtime-checkable?) Protocols at runtime, though they would be unwieldily large.

Even if we do (2) or (3) type checkers might still want to do (1) since it will be a while before the relevant runtime changes take effect.
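As a rough illustration of what "their own Protocols" could look like in user code, here is a minimal sketch. `ReadableBytes` and `first_bytes` are hypothetical names invented for this example; they are not part of `typing`, `io`, or typeshed.

```python
import io
from typing import Protocol, Union, runtime_checkable


@runtime_checkable
class ReadableBytes(Protocol):
    """Hypothetical narrow protocol: anything with a bytes-returning read()."""

    def read(self, size: int = -1, /) -> bytes: ...


def first_bytes(src: Union[ReadableBytes, bytes], n: int) -> bytes:
    # With a runtime-checkable Protocol, isinstance() tests for the
    # presence of read(), so the static and runtime views agree here.
    if isinstance(src, ReadableBytes):
        return src.read(n)
    return src[:n]


from_stream = first_bytes(io.BytesIO(b"hello"), 2)
from_bytes = first_bytes(b"hello", 2)
print(from_stream, from_bytes)  # b'he' b'he'
```

Note that `isinstance()` against a runtime-checkable Protocol only checks that the members exist, not their signatures or return types, so a large protocol in the style of option (3) would still be a fairly coarse runtime check.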

(Noticed this while looking into python/cpython#133492 .)


srittau · 2025-05-06T17:19:57Z

I still tend towards (2), although easy protocol composition would go a long way here. Reader and Writer were originally intended to be broader to encompass most cases, but became smaller in the process (which is probably for the best). But in python/cpython#133492 I suggested that we add a fairly broad protocol to _typeshed for now, maybe that's a good first step.

I also think we should look into deprecating BinaryIO and TextIO first, before looking into IO. BinaryIO is basically identical to IO[bytes] (except that IO[bytes].__enter__ returns IO, not BinaryIO). TextIO has a few more methods, but they all look highly situational.

cmaloney · 2025-05-15T14:34:30Z

For these, is there any way to differentiate/expose the difference in .write() behavior between "Raw I/O" (e.g. io.FileIO) and high-level I/O (BufferedIO and TextIO)? For "Raw I/O", if the write() system call only writes part of the buffer, it isn't retried, while higher-level I/O will loop until either the whole buffer is written or an exception is raised (non-blocking fds have extra caveats).

JelleZijlstra · 2025-05-15T14:39:39Z

I don't think so; in general typing can only cover information in the method's interface (what parameters it takes and of what types, and what it returns), not more subtle aspects of behavior.
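To make the behavioral difference concrete, here is a small sketch. `RawWriter`, `write_all`, and `ShortWriter` are hypothetical names made up for this example; `ShortWriter` is a stand-in raw stream that accepts at most three bytes per call, mimicking a partial write() system call.

```python
import io
from typing import Optional, Protocol


class RawWriter(Protocol):
    """Hypothetical protocol: the signature says nothing about short writes."""

    def write(self, data: memoryview, /) -> Optional[int]: ...


def write_all(raw: RawWriter, data: bytes) -> int:
    """Retry short writes by hand, as buffered I/O does internally."""
    view = memoryview(data)
    total = 0
    while total < len(view):
        written = raw.write(view[total:])
        if not written:
            # None (would block) or 0 bytes accepted: give up, don't spin.
            raise BlockingIOError("raw stream accepted no data")
        total += written
    return total


class ShortWriter(io.RawIOBase):
    """Stand-in raw stream that accepts at most 3 bytes per write()."""

    def __init__(self) -> None:
        super().__init__()
        self.received = bytearray()

    def writable(self) -> bool:
        return True

    def write(self, b) -> int:
        chunk = bytes(b[:3])
        self.received += chunk
        return len(chunk)


single_call = ShortWriter().write(memoryview(b"abcdefgh"))  # short write

sink = ShortWriter()
looped = write_all(sink, b"abcdefgh")

raw = ShortWriter()
buffered = io.BufferedWriter(raw)
buffered.write(b"abcdefgh")
buffered.flush()  # the buffered layer loops over raw.write() itself

print(single_call, looped, bytes(raw.received))
```

Both the raw stream and the buffered wrapper expose a `write()` that takes a buffer and returns a count, so an annotation cannot tell a caller which of the two needs a retry loop like `write_all`.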

JelleZijlstra added the topic: other Other topics not covered label May 6, 2025

DRayX mentioned this issue May 6, 2025

"typing.BinaryIO" missing "readinto" method python/cpython#133492
