• 0 Posts
  • 21 Comments
Joined 1 year ago
Cake day: June 25th, 2023


  • Python 2 had one mostly-working str class, and a mostly-broken unicode class.

    Python 3, for some reason, got rid of the one that mostly worked, leaving no replacement. The closest you can get is to spam surrogateescape everywhere, which is both incorrect and has a significant performance cost - and that still leaves several APIs unavailable.

    Simply removing str indexing would’ve fixed the common user mistake if that was really desirable. It’s not like unicode indexing is meaningful either, and now large amounts of historical data can no longer be accessed from Python.
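A minimal sketch of what the surrogateescape workaround mentioned above looks like in practice. The sample bytes and the helper names `decode_lossless` / `encode_lossless` are made up for illustration; only the `surrogateescape` error handler itself comes from the standard library.

```python
def decode_lossless(raw: bytes) -> str:
    # Undecodable bytes become lone surrogates (U+DC80..U+DCFF) instead of raising.
    return raw.decode("utf-8", errors="surrogateescape")


def encode_lossless(text: str) -> bytes:
    # The same handler turns those surrogates back into the original bytes.
    return text.encode("utf-8", errors="surrogateescape")


raw = b"caf\xe9.txt"  # hypothetical Latin-1 filename, not valid UTF-8
text = decode_lossless(raw)
round_tripped = encode_lossless(text)

# A strict encode of the same string fails, which is why the handler
# has to be repeated at every boundary.
try:
    text.encode("utf-8")
    strict_encode_ok = True
except UnicodeEncodeError:
    strict_encode_ok = False
```

The resulting str contains a lone surrogate, so any API that encodes strictly will reject it unless the handler is passed there too.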





  • Then - ignoring dunders that have weird rules - what, pray tell, is the point of protocols, other than backward compatibility with historical fragile ducks (at the cost of future backwards compatibility)? Why are people afraid of using real base classes?

    The fact that it is possible to subclass a Protocol is useless since you can’t enforce subclassing, which is necessary for maintainable software refactoring, unless it’s a purely internal interface (in which case the Union approach is probably still better).

    That PEP link includes broken examples so it’s really not worth much as a reference.

    (for that matter, the Sequence interface is also broken in Python, in case you need another historical example of why protocols are a bad idea).
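To make the "can't enforce subclassing" point concrete, here is a rough sketch. All class names (`SupportsClose`, `ResourceBase`, `Duck`, `Declared`) are hypothetical; the only real APIs used are `typing.Protocol`, `typing.runtime_checkable`, and `abc`.

```python
from abc import ABC, abstractmethod
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsClose(Protocol):
    def close(self) -> None: ...


class ResourceBase(ABC):
    @abstractmethod
    def close(self) -> None: ...


class Duck:
    # Never mentions SupportsClose or ResourceBase anywhere.
    def close(self) -> None:
        pass


class Declared(ResourceBase):
    # Conformance is stated explicitly and can be found by searching for subclasses.
    def close(self) -> None:
        pass


duck_matches_protocol = isinstance(Duck(), SupportsClose)
duck_is_base = isinstance(Duck(), ResourceBase)
declared_is_base = isinstance(Declared(), ResourceBase)
```

`Duck` passes the protocol check purely by having a method with the right name, so nothing ties it to the interface when the interface later changes. The real base class only accepts classes that opted in.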



  • o11c@programming.dev to Python@programming.dev: Protocols in Python
    1 year ago

    In practice, Protocols are a way to make “superclasses” that you can never add features to (for example, readinto, despite being critical for performance, is utterly broken in Python). This should normally be avoided at almost all costs, but for some reason people hate real base classes?

    If you really want to do something like the original article, where there’s a C-implemented class that you can’t change, you’re best off using a (named) Union of two similar types, not a Protocol.

    I suppose they are useful for operator overloading but that’s about it. But I’m not sure if type checkers actually implement that properly anyway; overloading is really nasty in a dynamically-typed language.
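A small sketch of the named-Union approach, under the assumption that the C-implemented class is `io.BytesIO` and the look-alike is a hypothetical `CountingBuffer`. The alias name `Sink` and the function `emit` are also invented for illustration.

```python
import io
from typing import Union


class CountingBuffer:
    """Hypothetical pure-Python stand-in that only counts what it is given."""

    def __init__(self) -> None:
        self.count = 0

    def write(self, data: bytes) -> int:
        self.count += len(data)
        return len(data)


# The named Union lists exactly the types that are accepted.
Sink = Union[io.BytesIO, CountingBuffer]


def emit(sink: Sink, payload: bytes) -> int:
    # Both members of the Union provide write(), so this type-checks.
    return sink.write(payload)
```

Unlike a Protocol, the alias is a closed list: adding a third accepted type is an explicit edit in one place.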


  • All of these can be done with raw strings just fine.

    For the first pathlib bug case, PATH-like lookup is common, not just for binaries but also data and conf files. If users explicitly request ./foo they will be very upset if your program instead looks at /defaultpath/foo. Also, God forbid you dare pass a Path("./--help") to some program. If you’re using os.path.dirname this works just fine.

    For the second pathlib bug case, dir/ is often written so that you’ll cause explicit errors if there’s a file by that name. Also there are programs like rsync where the trailing slash outright changes the meaning of the command. Again, os.path APIs give you the correct result.

    For the article mistake, backslash is a perfectly legal character in non-Windows filenames and should not be treated as a directory component separator. Thankfully, pathlib doesn’t make this mistake at least. OTOH, / is reasonable to treat as a directory component separator on Windows (and some native APIs already handle it, though normalization is always a problem).

    I also just found that the pathlib.Path constructor ignores extra kwargs. But Python has never bothered much with safety anyway, and this is minor compared to the outright bugs the other issues cause.
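The behaviors described above can be checked directly. This sketch uses `PurePosixPath` / `PureWindowsPath` and `posixpath` so it gives the same results on any platform; the sample paths are arbitrary.

```python
import posixpath
from pathlib import PurePosixPath, PureWindowsPath

# First case: the explicit "./" prefix is dropped by pathlib but kept by os.path-style code.
collapsed = str(PurePosixPath("./foo"))
collapsed_option = str(PurePosixPath("./--help"))
kept_dirname = posixpath.dirname("./foo")

# Second case: the trailing slash is dropped by pathlib but still visible to os.path-style code.
no_trailing = str(PurePosixPath("dir/"))
split_trailing = posixpath.split("dir/")

# Separators: backslash is an ordinary character in POSIX paths,
# while forward slash is accepted as a separator in Windows paths.
posix_parts = PurePosixPath("a\\b").parts
windows_parts = PureWindowsPath("a/b").parts
```

Once the string has gone through a pathlib object, the original `./` or trailing `/` cannot be recovered from it.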




  • Honestly you probably should think about how to translate them. Python at least rolls its own .mo parser so it can support multiple languages in a single process; it’s much more difficult in C unless you push it to the clients (which requires pushing the parameterization as well).
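As a rough illustration of the multiple-languages-in-one-process point: `gettext.translation` returns an independent catalog object per call, so each request or client can carry its own. The domain name `messages`, the `locale` directory, and the helper `translator_for` are assumptions for this sketch; with `fallback=True` a missing catalog just yields untranslated text.

```python
import gettext


def translator_for(lang: str, domain: str = "messages", localedir: str = "locale"):
    # Each call builds its own translations object; nothing global is changed.
    return gettext.translation(
        domain, localedir=localedir, languages=[lang], fallback=True
    )


de = translator_for("de")
fr = translator_for("fr")

# Both catalogs are usable side by side in the same process.
greeting_de = de.gettext("Hello")
greeting_fr = fr.gettext("Hello")
```

If no compiled .mo files exist under the assumed directory, both calls simply return the original message.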

    Non-.pot-based internationalization formats are almost always braindead and should be avoided.



  • No. Duck types (including virtual subclasses) considered harmful; use real inheritance if your language doesn’t provide anything strictly better.

    It is incomparably convenient to be able to retroactively add “default implementations” to interface functions (consider for example how broken readinto is in Python). Some statically-typed languages let you do that without inheritance, but no dynamically-typed language can.
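A sketch of what "retroactively adding a default implementation" means with a real base class. `Reader` and `ChunkReader` are hypothetical names, and this is not how the standard `io` classes are laid out; it only shows the mechanism.

```python
class Reader:
    """Hypothetical base class; subclasses only have to provide read()."""

    def read(self, n: int) -> bytes:
        raise NotImplementedError

    def readinto(self, buf) -> int:
        # Default added to the base class later: every existing subclass
        # gains it without being edited. Subclasses may override it with
        # something faster.
        data = self.read(len(buf))
        buf[: len(data)] = data
        return len(data)


class ChunkReader(Reader):
    # Written before readinto() existed; knows nothing about it.
    def __init__(self, data: bytes) -> None:
        self._data = data

    def read(self, n: int) -> bytes:
        chunk, self._data = self._data[:n], self._data[n:]
        return chunk


buf = bytearray(4)
reader = ChunkReader(b"abcdef")
first = reader.readinto(buf)
first_contents = bytes(buf)
second = reader.readinto(buf)
second_contents = bytes(buf)
```

A class that merely duck-types `read()` would never pick up the new method, which is the gap the comment is pointing at.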

    This reads more as a rant against inheritance (without any explanation whatsoever) than a legitimate argument.


  • You should have part of your test harness perform a separate import of every module. If your module is idempotent (most good code is) you could do this in a single process by cleaning sys.modules I guess … but it still won’t be part of your pytest process.

    Static analyzers can only detect some cases, so can’t be fully trusted.

    I’ve also found there are a lot of cases where performant Python code has to be implemented in a distinct way from what the type-checker sees. You can do this with aggressive type: ignore but I often find it cleaner to use separate if blocks.
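One way the separate-import check could look, as a sketch only: each module is imported in a fresh interpreter so that nothing already loaded by the test process can mask a missing import. The helper names `imports_cleanly` and `find_broken_imports` are made up, and the list of module names is assumed to come from your own project layout.

```python
import subprocess
import sys


def imports_cleanly(module_name: str) -> bool:
    # A fresh interpreter starts with a clean sys.modules.
    result = subprocess.run(
        [sys.executable, "-c", f"import {module_name}"],
        capture_output=True,
    )
    return result.returncode == 0


def find_broken_imports(module_names):
    # Returns the modules that fail when imported on their own.
    return [name for name in module_names if not imports_cleanly(name)]
```

Spawning one process per module is slow, but it is the straightforward way to guarantee isolation.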



  • The with approach would work if you use the debugger to change the current line I think.

    I don’t understand why this stopped using ASTs in favor of buggy regexes - you’re allowed to do whatever you want during the codec …

    Don’t forget to handle increment before continue.

    The main time I miss C-style for loops is dealing with linked lists and when manipulating the current iteration.

    The former should be easy enough - make the advancement provide __getattr__ expressions.

    The latter already works since it is in fact being transformed into a while. It’s impossible if you try to use for though.
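For reference, this is roughly what a C-style `for (node = head; node; node = node->next)` loop looks like once it is written as a `while`, including the increment-before-continue detail. The `Node` class and `even_values` function are hypothetical and not taken from the project being discussed.

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def even_values(head):
    out = []
    node = head                  # init clause
    while node is not None:      # condition clause
        if node.value % 2:
            node = node.next     # advancement has to run before continue too
            continue
        out.append(node.value)
        node = node.next         # advancement clause
    return out


head = Node(1, Node(2, Node(3, Node(4))))
result = even_values(head)
```

Forgetting the advancement before `continue` would loop forever on the first odd node.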


  • I’m increasingly convinced that Python/JS-style duck typing is always a mistake, since you can’t do default function impls for traits. Just use inheritance.

    Rust’s enums are even weirder, since they mix structuring with discrimination. You end up having to write everything twice most of the time. Again, use inheritance, though you’ll have to choose between if chains and virtual function calls.

    Python’s pathlib has a major footgun in that ./foo collapses to foo, negating the main point of writing it that way in the first place.