Commit 4d959a0

ambv, encukou, sethmlarson, AA-Turner, and serhiy-storchaka authored and committed
[3.10] pythongh-135034: Normalize link targets in tarfile, add os.path.realpath(strict='allow_missing') (pythonGH-135037)
Addresses CVEs 2024-12718, 2025-4138, 2025-4330, and 2025-4517.

(cherry picked from commit 3612d8f)
(cherry picked from commit c358142)
(cherry picked from commit 371b4ea)

Co-authored-by: Łukasz Langa <[email protected]>
Signed-off-by: Łukasz Langa <[email protected]>
Co-authored-by: Petr Viktorin <[email protected]>
Co-authored-by: Seth Michael Larson <[email protected]>
Co-authored-by: Adam Turner <[email protected]>
Co-authored-by: Serhiy Storchaka <[email protected]>
1 parent 880adf6 commit 4d959a0

11 files changed: +1056 -131 lines changed

Doc/library/os.path.rst

Lines changed: 29 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -352,10 +352,26 @@ the :mod:`glob` module.)
352352
links encountered in the path (if they are supported by the operating
353353
system).
354354

355-
If a path doesn't exist or a symlink loop is encountered, and *strict* is
356-
``True``, :exc:`OSError` is raised. If *strict* is ``False``, the path is
357-
resolved as far as possible and any remainder is appended without checking
358-
whether it exists.
355+
By default, the path is evaluated up to the first component that does not
356+
exist, is a symlink loop, or whose evaluation raises :exc:`OSError`.
357+
All such components are appended unchanged to the existing part of the path.
358+
359+
Some errors that are handled this way include "access denied", "not a
360+
directory", or "bad argument to internal function". Thus, the
361+
resulting path may be missing or inaccessible, may still contain
362+
links or loops, and may traverse non-directories.
363+
364+
This behavior can be modified by keyword arguments:
365+
366+
If *strict* is ``True``, the first error encountered when evaluating the path is
367+
re-raised.
368+
In particular, :exc:`FileNotFoundError` is raised if *path* does not exist,
369+
or another :exc:`OSError` if it is otherwise inaccessible.
370+
371+
If *strict* is :py:data:`os.path.ALLOW_MISSING`, errors other than
372+
:exc:`FileNotFoundError` are re-raised (as with ``strict=True``).
373+
Thus, the returned path will not contain any symbolic links, but the named
374+
file and some of its parent directories may be missing.
359375

360376
.. note::
361377
This function emulates the operating system's procedure for making a path
@@ -374,6 +390,15 @@ the :mod:`glob` module.)
374390
.. versionchanged:: 3.10
375391
The *strict* parameter was added.
376392

393+
.. versionchanged:: next
394+
The :py:data:`~os.path.ALLOW_MISSING` value for the *strict* parameter
395+
was added.
396+
397+
.. data:: ALLOW_MISSING
398+
399+
Special value used for the *strict* argument in :func:`realpath`.
400+
401+
.. versionadded:: next
377402

378403
.. function:: relpath(path, start=os.curdir)
379404

Doc/library/tarfile.rst

Lines changed: 20 additions & 0 deletions
@@ -237,6 +237,15 @@ The :mod:`tarfile` module defines the following exceptions:
237237
Raised to refuse extracting a symbolic link pointing outside the destination
238238
directory.
239239

240+
.. exception:: LinkFallbackError
241+
242+
Raised to refuse emulating a link (hard or symbolic) by extracting another
243+
archive member, when that member would be rejected by the filter location.
244+
The exception that was raised to reject the replacement member is available
245+
as :attr:`!BaseException.__context__`.
246+
247+
.. versionadded:: next
248+
240249

241250
The following constants are available at the module level:
242251

@@ -954,6 +963,12 @@ reused in custom filters:
954963
Implements the ``'data'`` filter.
955964
In addition to what ``tar_filter`` does:
956965

966+
- Normalize link targets (:attr:`TarInfo.linkname`) using
967+
:func:`os.path.normpath`.
968+
Note that this removes internal ``..`` components, which may change the
969+
meaning of the link if the path in :attr:`!TarInfo.linkname` traverses
970+
symbolic links.
971+
957972
- :ref:`Refuse <tarfile-extraction-refuse>` to extract links (hard or soft)
958973
that link to absolute paths, or ones that link outside the destination.
959974

@@ -982,6 +997,10 @@ reused in custom filters:
982997

983998
Return the modified ``TarInfo`` member.
984999

1000+
.. versionchanged:: next
1001+
1002+
Link targets are now normalized.
1003+
9851004

9861005
.. _tarfile-extraction-refuse:
9871006

@@ -1008,6 +1027,7 @@ Here is an incomplete list of things to consider:
10081027
* Extract to a :func:`new temporary directory <tempfile.mkdtemp>`
10091028
to prevent e.g. exploiting pre-existing links, and to make it easier to
10101029
clean up after a failed extraction.
1030+
* Disallow symbolic links if you do not need the functionality.
10111031
* When working with untrusted data, use external (e.g. OS-level) limits on
10121032
disk, memory and CPU usage.
10131033
* Check filenames against an allow-list of characters
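
The caveat in the `data_filter` text above, that removing internal `..` components can change what a link means when the path crosses a symbolic link, can be illustrated outside of `tarfile`. This is a minimal sketch that assumes a POSIX-like system where `os.symlink` is permitted; the directory names (`real`, `inner`, `alias`) and the function name are invented for the example.

```python
import os
import tempfile


def compare_resolutions():
    """Resolve 'alias/../file' lexically and by following the symlink."""
    with tempfile.TemporaryDirectory() as root:
        root = os.path.realpath(root)
        os.makedirs(os.path.join(root, "real", "inner"))
        # alias -> real/inner (a relative symbolic link inside root)
        os.symlink(os.path.join("real", "inner"), os.path.join(root, "alias"))

        raw = os.path.join(root, "alias", "..", "file")
        lexical = os.path.normpath(raw)    # drops 'alias/..' textually
        followed = os.path.realpath(raw)   # follows alias, then applies '..'
        return (os.path.relpath(lexical, root),
                os.path.relpath(followed, root))


lexical, followed = compare_resolutions()
print(lexical, followed)
```

The two results name different files: the lexical form stays at the top of the tree, while the filesystem-aware form ends up next to the link's target.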

Doc/whatsnew/3.10.rst

Lines changed: 34 additions & 0 deletions
@@ -2394,3 +2394,37 @@ email
23942394
check if the *strict* paramater is available.
23952395
(Contributed by Thomas Dwyer and Victor Stinner for :gh:`102988` to improve
23962396
the CVE-2023-27043 fix.)
2397+
2398+
2399+
Notable changes in 3.10.18
2400+
==========================
2401+
2402+
os.path
2403+
-------
2404+
2405+
* The *strict* parameter to :func:`os.path.realpath` accepts a new value,
2406+
:data:`os.path.ALLOW_MISSING`.
2407+
If used, errors other than :exc:`FileNotFoundError` will be re-raised;
2408+
the resulting path can be missing but it will be free of symlinks.
2409+
(Contributed by Petr Viktorin for :cve:`2025-4517`.)
2410+
2411+
tarfile
2412+
-------
2413+
2414+
* :func:`~tarfile.data_filter` now normalizes symbolic link targets in order to
2415+
avoid path traversal attacks.
2416+
(Contributed by Petr Viktorin in :gh:`127987` and :cve:`2025-4138`.)
2417+
* :func:`~tarfile.TarFile.extractall` now skips fixing up directory attributes
2418+
when a directory was removed or replaced by another kind of file.
2419+
(Contributed by Petr Viktorin in :gh:`127987` and :cve:`2024-12718`.)
2420+
* :func:`~tarfile.TarFile.extract` and :func:`~tarfile.TarFile.extractall`
2421+
now (re-)apply the extraction filter when substituting a link (hard or
2422+
symbolic) with a copy of another archive member, and when fixing up
2423+
directory attributes.
2424+
The former raises a new exception, :exc:`~tarfile.LinkFallbackError`.
2425+
(Contributed by Petr Viktorin for :cve:`2025-4330` and :cve:`2024-12718`.)
2426+
* :func:`~tarfile.TarFile.extract` and :func:`~tarfile.TarFile.extractall`
2427+
no longer extract rejected members when
2428+
:func:`~tarfile.TarFile.errorlevel` is zero.
2429+
(Contributed by Matt Prodani and Petr Viktorin in :gh:`112887`
2430+
and :cve:`2025-4435`.)

Lib/genericpath.py

Lines changed: 10 additions & 1 deletion
@@ -8,7 +8,7 @@
88

99
__all__ = ['commonprefix', 'exists', 'getatime', 'getctime', 'getmtime',
1010
'getsize', 'isdir', 'isfile', 'samefile', 'sameopenfile',
11-
'samestat']
11+
'samestat', 'ALLOW_MISSING']
1212

1313

1414
# Does a path exist?
@@ -153,3 +153,12 @@ def _check_arg_types(funcname, *args):
153153
f'os.PathLike object, not {s.__class__.__name__!r}') from None
154154
if hasstr and hasbytes:
155155
raise TypeError("Can't mix strings and bytes in path components") from None
156+
157+
# A singleton with a true boolean value.
158+
@object.__new__
159+
class ALLOW_MISSING:
160+
"""Special value for use in realpath()."""
161+
def __repr__(self):
162+
return 'os.path.ALLOW_MISSING'
163+
def __reduce__(self):
164+
return self.__class__.__name__
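
The `@object.__new__` decorator used above replaces the class with a single instance of itself, which is what makes the name a singleton. The following sketch reproduces the pattern with a hypothetical sentinel, `MISSING_OK`, and a made-up `describe` function; it is not the library's code, only an illustration of why the identity check has to come before the truthiness check.

```python
@object.__new__
class MISSING_OK:
    """Hypothetical sentinel built the same way as ALLOW_MISSING."""

    def __repr__(self):
        return "MISSING_OK"


def describe(strict=False):
    """Classify a *strict* argument the way the diff's callers do."""
    # The sentinel is truthy, so it must be tested by identity first;
    # otherwise it would be indistinguishable from strict=True.
    if strict is MISSING_OK:
        return "allow missing"
    elif strict:
        return "strict"
    return "lenient"


print(repr(MISSING_OK), bool(MISSING_OK))
print(describe(), describe(True), describe(MISSING_OK))
```

Because the name is bound to the instance rather than the class, callers compare with `is` and never need to instantiate anything.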

Lib/ntpath.py

Lines changed: 23 additions & 12 deletions
@@ -30,7 +30,8 @@
3030
"ismount", "expanduser","expandvars","normpath","abspath",
3131
"curdir","pardir","sep","pathsep","defpath","altsep",
3232
"extsep","devnull","realpath","supports_unicode_filenames","relpath",
33-
"samefile", "sameopenfile", "samestat", "commonpath"]
33+
"samefile", "sameopenfile", "samestat", "commonpath",
34+
"ALLOW_MISSING"]
3435

3536
def _get_bothseps(path):
3637
if isinstance(path, bytes):
@@ -571,9 +572,10 @@ def abspath(path):
571572
from nt import _getfinalpathname, readlink as _nt_readlink
572573
except ImportError:
573574
# realpath is a no-op on systems without _getfinalpathname support.
574-
realpath = abspath
575+
def realpath(path, *, strict=False):
576+
return abspath(path)
575577
else:
576-
def _readlink_deep(path):
578+
def _readlink_deep(path, ignored_error=OSError):
577579
# These error codes indicate that we should stop reading links and
578580
# return the path we currently have.
579581
# 1: ERROR_INVALID_FUNCTION
@@ -606,7 +608,7 @@ def _readlink_deep(path):
606608
path = old_path
607609
break
608610
path = normpath(join(dirname(old_path), path))
609-
except OSError as ex:
611+
except ignored_error as ex:
610612
if ex.winerror in allowed_winerror:
611613
break
612614
raise
@@ -615,7 +617,7 @@ def _readlink_deep(path):
615617
break
616618
return path
617619

618-
def _getfinalpathname_nonstrict(path):
620+
def _getfinalpathname_nonstrict(path, ignored_error=OSError):
619621
# These error codes indicate that we should stop resolving the path
620622
# and return the value we currently have.
621623
# 1: ERROR_INVALID_FUNCTION
@@ -642,17 +644,18 @@ def _getfinalpathname_nonstrict(path):
642644
try:
643645
path = _getfinalpathname(path)
644646
return join(path, tail) if tail else path
645-
except OSError as ex:
647+
except ignored_error as ex:
646648
if ex.winerror not in allowed_winerror:
647649
raise
648650
try:
649651
# The OS could not resolve this path fully, so we attempt
650652
# to follow the link ourselves. If we succeed, join the tail
651653
# and return.
652-
new_path = _readlink_deep(path)
654+
new_path = _readlink_deep(path,
655+
ignored_error=ignored_error)
653656
if new_path != path:
654657
return join(new_path, tail) if tail else new_path
655-
except OSError:
658+
except ignored_error:
656659
# If we fail to readlink(), let's keep traversing
657660
pass
658661
path, name = split(path)
@@ -683,16 +686,24 @@ def realpath(path, *, strict=False):
683686
if normcase(path) == normcase(devnull):
684687
return '\\\\.\\NUL'
685688
had_prefix = path.startswith(prefix)
689+
690+
if strict is ALLOW_MISSING:
691+
ignored_error = FileNotFoundError
692+
strict = True
693+
elif strict:
694+
ignored_error = ()
695+
else:
696+
ignored_error = OSError
697+
686698
if not had_prefix and not isabs(path):
687699
path = join(cwd, path)
688700
try:
689701
path = _getfinalpathname(path)
690702
initial_winerror = 0
691-
except OSError as ex:
692-
if strict:
693-
raise
703+
except ignored_error as ex:
694704
initial_winerror = ex.winerror
695-
path = _getfinalpathname_nonstrict(path)
705+
path = _getfinalpathname_nonstrict(path,
706+
ignored_error=ignored_error)
696707
# The path returned by _getfinalpathname will always start with \\?\ -
697708
# strip off that prefix unless it was already provided on the original
698709
# path.

Lib/posixpath.py

Lines changed: 11 additions & 2 deletions
@@ -35,7 +35,7 @@
3535
"samefile","sameopenfile","samestat",
3636
"curdir","pardir","sep","pathsep","defpath","altsep","extsep",
3737
"devnull","realpath","supports_unicode_filenames","relpath",
38-
"commonpath"]
38+
"commonpath", "ALLOW_MISSING"]
3939

4040

4141
def _get_sep(path):
@@ -407,6 +407,15 @@ def _joinrealpath(path, rest, strict, seen):
407407
sep = '/'
408408
curdir = '.'
409409
pardir = '..'
410+
getcwd = os.getcwd
411+
if strict is ALLOW_MISSING:
412+
ignored_error = FileNotFoundError
413+
elif strict:
414+
ignored_error = ()
415+
else:
416+
ignored_error = OSError
417+
418+
maxlinks = None
410419

411420
if isabs(rest):
412421
rest = rest[1:]
@@ -429,7 +438,7 @@ def _joinrealpath(path, rest, strict, seen):
429438
newpath = join(path, name)
430439
try:
431440
st = os.lstat(newpath)
432-
except OSError:
441+
except ignored_error:
433442
if strict:
434443
raise
435444
is_link = False
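
Both path modules above pick an exception class (or an empty tuple) up front and then use it in `except ignored_error:`. The sketch below isolates that technique. It is a simplified stand-in rather than the code from the diff: `MISSING_OK`, `pick_ignored_error`, and `lstat_or_none` are hypothetical names.

```python
import os
import tempfile

MISSING_OK = object()  # hypothetical stand-in for os.path.ALLOW_MISSING


def pick_ignored_error(strict):
    """Choose which exceptions a lookup is allowed to swallow."""
    if strict is MISSING_OK:
        return FileNotFoundError   # tolerate only missing components
    if strict:
        return ()                  # empty tuple: 'except ()' catches nothing
    return OSError                 # tolerate every OS-level failure


def lstat_or_none(path, strict=False):
    """Return os.lstat(path), or None if the error is one we ignore."""
    ignored_error = pick_ignored_error(strict)
    try:
        return os.lstat(path)
    except ignored_error:
        return None


with tempfile.TemporaryDirectory() as root:
    missing = os.path.join(root, "missing")
    lenient = lstat_or_none(missing)
    tolerant = lstat_or_none(missing, strict=MISSING_OK)
    try:
        lstat_or_none(missing, strict=True)
        strict_raised = False
    except FileNotFoundError:
        strict_raised = True

print(lenient, tolerant, strict_raised)
```

Selecting the exception type once keeps the `try` blocks identical across modes, at the cost of making the control flow depend on a variable that is easy to overlook when reading an `except` clause.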
