Commit 371b4ea

ambv, encukou, sethmlarson, AA-Turner, and serhiy-storchaka authored and committed
[3.11] pythongh-135034: Normalize link targets in tarfile, add os.path.realpath(strict='allow_missing') (pythonGH-135037)
Addresses CVEs 2024-12718, 2025-4138, 2025-4330, and 2025-4517.

(cherry picked from commit 3612d8f)
(cherry picked from commit c358142)

Co-authored-by: Łukasz Langa <[email protected]>
Signed-off-by: Łukasz Langa <[email protected]>
Co-authored-by: Petr Viktorin <[email protected]>
Co-authored-by: Seth Michael Larson <[email protected]>
Co-authored-by: Adam Turner <[email protected]>
Co-authored-by: Serhiy Storchaka <[email protected]>
1 parent f8b4421 commit 371b4ea

11 files changed: +1169 -134 lines changed

Doc/library/os.path.rst

Lines changed: 29 additions & 4 deletions
@@ -352,10 +352,26 @@ the :mod:`glob` module.)
    links encountered in the path (if they are supported by the operating
    system).
 
-   If a path doesn't exist or a symlink loop is encountered, and *strict* is
-   ``True``, :exc:`OSError` is raised. If *strict* is ``False``, the path is
-   resolved as far as possible and any remainder is appended without checking
-   whether it exists.
+   By default, the path is evaluated up to the first component that does not
+   exist, is a symlink loop, or whose evaluation raises :exc:`OSError`.
+   All such components are appended unchanged to the existing part of the path.
+
+   Some errors that are handled this way include "access denied", "not a
+   directory", or "bad argument to internal function". Thus, the
+   resulting path may be missing or inaccessible, may still contain
+   links or loops, and may traverse non-directories.
+
+   This behavior can be modified by keyword arguments:
+
+   If *strict* is ``True``, the first error encountered when evaluating the path is
+   re-raised.
+   In particular, :exc:`FileNotFoundError` is raised if *path* does not exist,
+   or another :exc:`OSError` if it is otherwise inaccessible.
+
+   If *strict* is :py:data:`os.path.ALLOW_MISSING`, errors other than
+   :exc:`FileNotFoundError` are re-raised (as with ``strict=True``).
+   Thus, the returned path will not contain any symbolic links, but the named
+   file and some of its parent directories may be missing.
 
    .. note::
       This function emulates the operating system's procedure for making a path
@@ -374,6 +390,15 @@ the :mod:`glob` module.)
    .. versionchanged:: 3.10
       The *strict* parameter was added.
 
+   .. versionchanged:: next
+      The :py:data:`~os.path.ALLOW_MISSING` value for the *strict* parameter
+      was added.
+
+.. data:: ALLOW_MISSING
+
+   Special value used for the *strict* argument in :func:`realpath`.
+
+   .. versionadded:: next
 
 .. function:: relpath(path, start=os.curdir)
 

Doc/library/tarfile.rst

Lines changed: 20 additions & 0 deletions
@@ -239,6 +239,15 @@ The :mod:`tarfile` module defines the following exceptions:
    Raised to refuse extracting a symbolic link pointing outside the destination
    directory.
 
+.. exception:: LinkFallbackError
+
+   Raised to refuse emulating a link (hard or symbolic) by extracting another
+   archive member, when that member would be rejected by the filter location.
+   The exception that was raised to reject the replacement member is available
+   as :attr:`!BaseException.__context__`.
+
+   .. versionadded:: next
+
 
 The following constants are available at the module level:
 
@@ -1037,6 +1046,12 @@ reused in custom filters:
   Implements the ``'data'`` filter.
   In addition to what ``tar_filter`` does:
 
+  - Normalize link targets (:attr:`TarInfo.linkname`) using
+    :func:`os.path.normpath`.
+    Note that this removes internal ``..`` components, which may change the
+    meaning of the link if the path in :attr:`!TarInfo.linkname` traverses
+    symbolic links.
+
   - :ref:`Refuse <tarfile-extraction-refuse>` to extract links (hard or soft)
     that link to absolute paths, or ones that link outside the destination.
 
@@ -1065,6 +1080,10 @@ reused in custom filters:
 
   Return the modified ``TarInfo`` member.
 
+  .. versionchanged:: next
+
+     Link targets are now normalized.
+
 
 .. _tarfile-extraction-refuse:
 
@@ -1091,6 +1110,7 @@ Here is an incomplete list of things to consider:
 * Extract to a :func:`new temporary directory <tempfile.mkdtemp>`
   to prevent e.g. exploiting pre-existing links, and to make it easier to
   clean up after a failed extraction.
+* Disallow symbolic links if you do not need the functionality.
 * When working with untrusted data, use external (e.g. OS-level) limits on
   disk, memory and CPU usage.
 * Check filenames against an allow-list of characters
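
The caveat about normalizing link targets can be illustrated without touching the filesystem. This is a minimal sketch, assuming POSIX-style paths (hence `posixpath` rather than `os.path`); the link targets are made-up examples and not taken from the commit.

```python
import posixpath

# Hypothetical link targets, as they might appear in TarInfo.linkname.
targets = ["a/b/../target", "./docs//readme.txt", "../../outside"]

# normpath() works on the string alone; it never consults the filesystem.
normalized = {t: posixpath.normpath(t) for t in targets}

for original, result in normalized.items():
    print(f"{original!r} -> {result!r}")
# 'a/b/../target' -> 'a/target'
# './docs//readme.txt' -> 'docs/readme.txt'
# '../../outside' -> '../../outside'
```

If `a/b` were itself a symbolic link to another directory, the operating system would resolve `a/b/..` to the parent of that directory, whereas the normalized form `a/target` stays under `a`. Leading `..` components are kept, so normalization alone does not confine a link to the destination.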

Doc/whatsnew/3.11.rst

Lines changed: 34 additions & 0 deletions
@@ -2786,3 +2786,37 @@ email
   check if the *strict* paramater is available.
   (Contributed by Thomas Dwyer and Victor Stinner for :gh:`102988` to improve
   the CVE-2023-27043 fix.)
+
+
+Notable changes in 3.11.13
+==========================
+
+os.path
+-------
+
+* The *strict* parameter to :func:`os.path.realpath` accepts a new value,
+  :data:`os.path.ALLOW_MISSING`.
+  If used, errors other than :exc:`FileNotFoundError` will be re-raised;
+  the resulting path can be missing but it will be free of symlinks.
+  (Contributed by Petr Viktorin for :cve:`2025-4517`.)
+
+tarfile
+-------
+
+* :func:`~tarfile.data_filter` now normalizes symbolic link targets in order to
+  avoid path traversal attacks.
+  (Contributed by Petr Viktorin in :gh:`127987` and :cve:`2025-4138`.)
+* :func:`~tarfile.TarFile.extractall` now skips fixing up directory attributes
+  when a directory was removed or replaced by another kind of file.
+  (Contributed by Petr Viktorin in :gh:`127987` and :cve:`2024-12718`.)
+* :func:`~tarfile.TarFile.extract` and :func:`~tarfile.TarFile.extractall`
+  now (re-)apply the extraction filter when substituting a link (hard or
+  symbolic) with a copy of another archive member, and when fixing up
+  directory attributes.
+  The former raises a new exception, :exc:`~tarfile.LinkFallbackError`.
+  (Contributed by Petr Viktorin for :cve:`2025-4330` and :cve:`2024-12718`.)
+* :func:`~tarfile.TarFile.extract` and :func:`~tarfile.TarFile.extractall`
+  no longer extract rejected members when
+  :func:`~tarfile.TarFile.errorlevel` is zero.
+  (Contributed by Matt Prodani and Petr Viktorin in :gh:`112887`
+  and :cve:`2025-4435`.)

Lib/genericpath.py

Lines changed: 10 additions & 1 deletion
@@ -8,7 +8,7 @@
 
 __all__ = ['commonprefix', 'exists', 'getatime', 'getctime', 'getmtime',
            'getsize', 'isdir', 'isfile', 'samefile', 'sameopenfile',
-           'samestat']
+           'samestat', 'ALLOW_MISSING']
 
 
 # Does a path exist?
@@ -153,3 +153,12 @@ def _check_arg_types(funcname, *args):
                             f'os.PathLike object, not {s.__class__.__name__!r}') from None
     if hasstr and hasbytes:
         raise TypeError("Can't mix strings and bytes in path components") from None
+
+# A singleton with a true boolean value.
+@object.__new__
+class ALLOW_MISSING:
+    """Special value for use in realpath()."""
+    def __repr__(self):
+        return 'os.path.ALLOW_MISSING'
+    def __reduce__(self):
+        return self.__class__.__name__
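
Using `object.__new__` as a class decorator binds the name to a single instance of the class rather than to the class itself. A minimal sketch of the same pattern follows; the names `SENTINEL` and `describe` are hypothetical and not part of the commit.

```python
@object.__new__
class SENTINEL:
    """Hypothetical sentinel built the same way as ALLOW_MISSING."""
    def __repr__(self):
        return "SENTINEL"


def describe(strict=False):
    # The identity check must come first: the sentinel is truthy,
    # so a plain `if strict:` would treat it like strict=True.
    if strict is SENTINEL:
        return "allow-missing"
    elif strict:
        return "strict"
    return "lenient"


print(repr(SENTINEL), bool(SENTINEL))  # SENTINEL True
print(describe(), describe(True), describe(SENTINEL))
# lenient strict allow-missing
```

Because the instance defines neither `__bool__` nor `__len__`, it evaluates as true, which matches the "true boolean value" comment in the hunk.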

Lib/ntpath.py

Lines changed: 24 additions & 13 deletions
@@ -30,7 +30,8 @@
            "ismount", "expanduser","expandvars","normpath","abspath",
            "curdir","pardir","sep","pathsep","defpath","altsep",
            "extsep","devnull","realpath","supports_unicode_filenames","relpath",
-           "samefile", "sameopenfile", "samestat", "commonpath"]
+           "samefile", "sameopenfile", "samestat", "commonpath",
+           "ALLOW_MISSING"]
 
 def _get_bothseps(path):
     if isinstance(path, bytes):
@@ -578,9 +579,10 @@ def abspath(path):
     from nt import _getfinalpathname, readlink as _nt_readlink
 except ImportError:
     # realpath is a no-op on systems without _getfinalpathname support.
-    realpath = abspath
+    def realpath(path, *, strict=False):
+        return abspath(path)
 else:
-    def _readlink_deep(path):
+    def _readlink_deep(path, ignored_error=OSError):
         # These error codes indicate that we should stop reading links and
         # return the path we currently have.
         # 1: ERROR_INVALID_FUNCTION
@@ -613,7 +615,7 @@ def _readlink_deep(path):
                         path = old_path
                         break
                     path = normpath(join(dirname(old_path), path))
-            except OSError as ex:
+            except ignored_error as ex:
                 if ex.winerror in allowed_winerror:
                     break
                 raise
@@ -622,7 +624,7 @@ def _readlink_deep(path):
                 break
         return path
 
-    def _getfinalpathname_nonstrict(path):
+    def _getfinalpathname_nonstrict(path, ignored_error=OSError):
         # These error codes indicate that we should stop resolving the path
         # and return the value we currently have.
         # 1: ERROR_INVALID_FUNCTION
@@ -649,17 +651,18 @@ def _getfinalpathname_nonstrict(path):
             try:
                 path = _getfinalpathname(path)
                 return join(path, tail) if tail else path
-            except OSError as ex:
+            except ignored_error as ex:
                 if ex.winerror not in allowed_winerror:
                     raise
                 try:
                     # The OS could not resolve this path fully, so we attempt
                     # to follow the link ourselves. If we succeed, join the tail
                     # and return.
-                    new_path = _readlink_deep(path)
+                    new_path = _readlink_deep(path,
+                                              ignored_error=ignored_error)
                     if new_path != path:
                         return join(new_path, tail) if tail else new_path
-                except OSError:
+                except ignored_error:
                     # If we fail to readlink(), let's keep traversing
                     pass
                 path, name = split(path)
@@ -690,24 +693,32 @@ def realpath(path, *, strict=False):
             if normcase(path) == normcase(devnull):
                 return '\\\\.\\NUL'
         had_prefix = path.startswith(prefix)
+
+        if strict is ALLOW_MISSING:
+            ignored_error = FileNotFoundError
+            strict = True
+        elif strict:
+            ignored_error = ()
+        else:
+            ignored_error = OSError
+
         if not had_prefix and not isabs(path):
             path = join(cwd, path)
         try:
             path = _getfinalpathname(path)
             initial_winerror = 0
         except ValueError as ex:
             # gh-106242: Raised for embedded null characters
-            # In strict mode, we convert into an OSError.
+            # In strict modes, we convert into an OSError.
             # Non-strict mode returns the path as-is, since we've already
             # made it absolute.
             if strict:
                 raise OSError(str(ex)) from None
             path = normpath(path)
-        except OSError as ex:
-            if strict:
-                raise
+        except ignored_error as ex:
             initial_winerror = ex.winerror
-            path = _getfinalpathname_nonstrict(path)
+            path = _getfinalpathname_nonstrict(path,
+                                               ignored_error=ignored_error)
         # The path returned by _getfinalpathname will always start with \\?\ -
         # strip off that prefix unless it was already provided on the original
         # path.

Lib/posixpath.py

Lines changed: 11 additions & 2 deletions
@@ -35,7 +35,7 @@
            "samefile","sameopenfile","samestat",
            "curdir","pardir","sep","pathsep","defpath","altsep","extsep",
            "devnull","realpath","supports_unicode_filenames","relpath",
-           "commonpath"]
+           "commonpath", "ALLOW_MISSING"]
 
 
 def _get_sep(path):
@@ -427,6 +427,15 @@ def _joinrealpath(path, rest, strict, seen):
         sep = '/'
         curdir = '.'
         pardir = '..'
+        getcwd = os.getcwd
+    if strict is ALLOW_MISSING:
+        ignored_error = FileNotFoundError
+    elif strict:
+        ignored_error = ()
+    else:
+        ignored_error = OSError
+
+    maxlinks = None
 
     if isabs(rest):
         rest = rest[1:]
@@ -449,7 +458,7 @@ def _joinrealpath(path, rest, strict, seen):
         newpath = join(path, name)
         try:
             st = os.lstat(newpath)
-        except OSError:
+        except ignored_error:
             if strict:
                 raise
             is_link = False
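
Both path modules select an `ignored_error` value and pass it to `except`. The technique relies on the fact that an `except` clause accepts any exception class or tuple of classes, and that an empty tuple matches nothing. The following is a minimal sketch of that idea in isolation; the helper `lstat_or_none` is hypothetical and does not reproduce the commit's control flow.

```python
import os
import tempfile


def lstat_or_none(path, ignored_error):
    """Return os.lstat(path), or None if the error class is ignored."""
    try:
        return os.lstat(path)
    except ignored_error:
        return None


with tempfile.TemporaryDirectory() as tmp:
    missing = os.path.join(tmp, "missing")

    # Non-strict mode: every OSError is swallowed.
    lenient = lstat_or_none(missing, OSError)

    # ALLOW_MISSING mode: only FileNotFoundError is swallowed.
    tolerant = lstat_or_none(missing, FileNotFoundError)

    # Strict mode: `except ():` catches nothing, so the error propagates.
    try:
        lstat_or_none(missing, ())
        strict_raised = False
    except FileNotFoundError:
        strict_raised = True
```

Passing the exception class as data keeps a single `try`/`except` block for all three modes instead of branching on *strict* at every call site.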
