Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explicit Tail Calls #3407

Open
wants to merge 87 commits into
base: master
Choose a base branch
from
Open

Explicit Tail Calls #3407

wants to merge 87 commits into from

Conversation

phi-go
Copy link

@phi-go phi-go commented Apr 6, 2023

This RFC proposes a feature to provide a guarantee that function calls are tail-call eliminated via the become keyword. If this guarantee can not be provided an error is generated instead.

Rendered

For reference, previous RFCs #81 and #1888, as well as an earlier issue #271, and the currently active issue #2691.

Rust tracking issue: rust-lang/rust#112788

text/0000-guaranteed-tco.md Outdated Show resolved Hide resolved
text/0000-guaranteed-tco.md Outdated Show resolved Hide resolved
text/0000-guaranteed-tco.md Outdated Show resolved Hide resolved
text/0000-guaranteed-tco.md Outdated Show resolved Hide resolved
@ehuss ehuss added the T-lang Relevant to the language team, which will review and decide on the RFC. label Apr 6, 2023
@Robbepop
Copy link
Contributor

Robbepop commented Apr 6, 2023

thanks a ton @phi-go for all the work you put into writing the RFC so far! Really appreciated! 🎉

text/0000-guaranteed-tco.md Outdated Show resolved Hide resolved
text/0000-guaranteed-tco.md Outdated Show resolved Hide resolved
text/0000-guaranteed-tco.md Outdated Show resolved Hide resolved
text/0000-guaranteed-tco.md Outdated Show resolved Hide resolved
@clarfonthey
Copy link
Contributor

While I personally know the abbreviation TCO, I think that it would be helpful to expand the acronym in the issue title for folks who might not know it at first glance.

@phi-go phi-go changed the title Guaranteed TCO Guaranteed TCO (tail call optimization) Apr 6, 2023
text/0000-guaranteed-tco.md Outdated Show resolved Hide resolved
text/0000-guaranteed-tco.md Outdated Show resolved Hide resolved
text/0000-guaranteed-tco.md Outdated Show resolved Hide resolved
text/0000-guaranteed-tco.md Outdated Show resolved Hide resolved
text/0000-guaranteed-tco.md Outdated Show resolved Hide resolved
text/0000-guaranteed-tco.md Outdated Show resolved Hide resolved
text/0000-guaranteed-tco.md Outdated Show resolved Hide resolved
text/0000-guaranteed-tco.md Outdated Show resolved Hide resolved
text/0000-guaranteed-tco.md Outdated Show resolved Hide resolved
@VitWW
Copy link

VitWW commented Apr 6, 2023

An alternative is to mark a function (like in OCaml), not a return keyword:

recursive fn x() {
    let a = Box::new(());
    let b = Box::new(());
    y(a);
}

@Robbepop
Copy link
Contributor

Robbepop commented Apr 6, 2023

An alternative is to mark a function (like in OCaml), not a return keyword:

recursive fn x() {
    let a = Box::new(());
    let b = Box::new(());
    y(a);
}

The RFC already highlights an alternative design with markers on function declarations and states that tail calls are a property of the function call and not a property of a function declaration since there are use cases where the same function is used in a normal call and a tail call.

@digama0
Copy link
Contributor

digama0 commented Apr 6, 2023

Note: this may be suitable either as a comment on #2691 or here. I'm assuming interested parties are watching both anyway.

The restriction on caller and callee having the exact same signature sounds quite restrictive in practice. Comparing it with [Pre-RFC] Safe goto with value (which does a similar thing to the become keyword but for intra-function control flow instead of across functions), the reason that proposal doesn't have any requirements on labels having the same arguments is because all parameters in all labels are part of the same call frame, and when a local is used by different label than the current one it is still there in memory, just uninitialized.

If we translate it to the become approach, that basically means that each function involved in a mutual recursion would (internally) take the union of all the parameters of all of the functions, and any parameters not used by the current function would just be uninitialized. There are two limitations to this approach:

  • It changes the calling convention of the function (since you have to pad out the arguments with these uninitialized locals)
  • The calling convention depends on what other functions are called in the body

I don't think this should be handled by an actual calling convention label though. Calling these "rusttail" functions for discussion purposes, we have the following limitations:

  • You can't get a function pointer with "rusttail" convention
  • You can't use a "rusttail" function cross-crate

The user doesn't have to see either of these limitations directly though; the compiler can just generate a shim with the usual (or specified) calling convention which calls the "rusttail" function if required. Instead, they just observe the following restrictions, which are strictly weaker than the one in the RFC:

  • If you become a function in another crate, the arguments of caller and callee have to match exactly
  • If you only become a function in the same crate, there are no restrictions
  • (If f tail-calls a cross-crate function g and also an internal function h, then f,g,h must all have the same arguments)

text/0000-guaranteed-tco.md Outdated Show resolved Hide resolved
@comex
Copy link

comex commented Jul 4, 2023

@digama0 I created a thread in the unsafe-code-guidelines repo (for lack of a better place) to discuss how feasible it is to formally require tail call elimination.

Going back to this RFC, I agree that it doesn't need to be blocked by a precise definition. In lieu of one, I would prefer to go with a handwavy requirement to bound stack usage rather than just calling it QoI. But whatever. There probably won't be some smart-aleck author of an alternative implementation who comes in and says "I'm not implementing proper tail calls because the spec doesn't require it". Probably.

I agree with others that it's critical for rustc, at least, to implement explicit tail calls in a way that guarantees bounded stack usage on all supported targets. So either the 'MVP WebAssembly' target needs to be meaningfully deprecated (in favor of WebAssembly with the more recent tail call feature), or we need to mark tail-callable functions with an attribute.

I don't see what's so bad about the attribute. I don't want to go in circles here, but… the RFC says of requiring an attribute that "while quite noisy it is also less flexible than the chosen approach." But requiring an exact function signature match is drastically less flexible! Regardless of whether the initial implementation allows non-matching signatures, we should want to support them eventually.

Without an attribute on tail-callable functions, supporting non-signature-matched calls would require changing the default extern "Rust" calling convention on all targets from caller-cleanup to callee-cleanup. I remember someone claiming somewhere that there might be a performance cost to that. Maybe there isn't. But right now we don't know.

If we require an attribute for now, then we preserve the ability to support non-signature-matched calls (either in the initial implementation or in the future), regardless of the feasibility of changing the default calling convention. And we also make it possible to support tail calls on MVP WebAssembly via trampolines, at least during the deprecation process.

If in the future changing the default calling convention turns out to be feasible, and MVP WebAssembly has been properly deprecated, then the attribute can just become a no-op at that point.

(Regarding a C backend, I've always wanted Rust to have one, but I don't think it's important enough to block tail calls, particularly given that it currently does not exist.)

@digama0
Copy link
Contributor

digama0 commented Jul 4, 2023

I don't see what's so bad about the attribute. I don't want to go in circles here, but… the RFC says of requiring an attribute that "while quite noisy it is also less flexible than the chosen approach." But requiring an exact function signature match is drastically less flexible! Regardless of whether the initial implementation allows non-matching signatures, we should want to support them eventually.

I'll take a moment here to pitch again my earlier suggestion to use same-crate calls instead of an attribute. The compiler can make this Just Work™ by analyzing the call structure and making shims to interface with any other code that wants to take a reference to the function pointer or call the function from outside the crate. The overall effect is that you can become any function in the same crate with an arbitrary signature, and become any function in any other crate with the same signature. This can be done with no impact on extern "Rust" fn calling convention.

@DemiMarie
Copy link

@comex: So there is actually a trick one can use to avoid needing the attribute: the only functions that need to be able to be tail called are the ones that themselves contain tail calls. Therefore, one can include “has tail calls” in the ABI of a function, and use callee-pops calling conventions/trampolines/etc for precisely the functions that themselves contain become.

@programmerjake
Copy link
Member

programmerjake commented Jul 5, 2023

@comex: So there is actually a trick one can use to avoid needing the attribute: the only functions that need to be able to be tail called are the ones that themselves contain tail calls. Therefore, one can include “has tail calls” in the ABI of a function, and use callee-pops calling conventions/trampolines/etc for precisely the functions that themselves contain become.

i doubt that will work because those functions can be called through function pointers and the cast to extern "Rust" function pointers loses the tail/not-tail ABI distinction.

@CAD97
Copy link

CAD97 commented Jul 5, 2023

While somewhat annoying to keep track of in the compiler, the workaround is quite simple: you have two entry points to the function. When called statically it uses the tail call convention, but when a function pointer is taken, it gets a small additional shim to the standard call convention.

Unless the attribute would change the function to not be extern "Rust" and/or prevent making annotated functions into function pointers, the same thing needs to be handled there as well. And the same goes for any sort of trampoline transform; the standard entry looks normal from the outside and sets up the trampoline for the real implementation.

Though it should still be noted explicitly that such a scheme of course prevents tail calling a function pointer, since it's using the standard ABI rather than the tail ABI.

@programmerjake
Copy link
Member

Though it should still be noted explicitly that such a scheme of course prevents tail calling a function pointer, since it's using the standard ABI rather than the tail ABI.

but tail calling function pointers is necessary for fast interpreters which is a major motivation for become:

union Imm {
    branch_target: *const Inst,
    value: isize,
}

struct Inst {
    run: unsafe fn(pc: *const Inst, stack: *mut u64, mem: *mut u8),
    imm: Imm,
}

unsafe fn add_imm(pc: *const Inst, stack: *mut u64, mem: *mut u8) {
    *stack = (*stack).wrapping_add((*pc).imm.value as u64);
    let pc = pc.add(1);
    become (*pc).run(pc, stack, mem)
}

unsafe fn branch_if(pc: *const Inst, stack: *mut u64, mem: *mut u8) {
    let v = *stack;
    let stack = stack.add(1);
    let pc = if v != 0 {
        (*pc).imm.branch_target
    } else {
        pc.add(1)
    };
    become (*pc).run(pc, stack, mem)
}

// more instructions...

@digama0
Copy link
Contributor

digama0 commented Jul 5, 2023

Though it should still be noted explicitly that such a scheme of course prevents tail calling a function pointer, since it's using the standard ABI rather than the tail ABI.

but tail calling function pointers is necessary for fast interpreters which is a major motivation for become:

Wow, I'm feeling quite un-heard here. You can have both! There is no technical restriction to having both same-sig function pointer calls powered by the mechanism in the RFC, and same-crate arbitrary-sig tail calls powered by a compiler transform and a shim for interfacing with code that needs the standard calling convention. You don't even need an annotation to disambiguate them, the compiler can figure out what case you are in automatically.

Your example is fine since it is covered by the mechanism specified in the RFC. And I think it will be a general pattern with become -to- function pointer that they will satisfy the same signature restriction, because you need them all to have the same function pointer type to put them in an array or what have you. The correct form of the restriction @CAD97 is highlighting is that because function pointers use the standard ABI, they must unconditionally satisfy the same signature restriction, even for same-crate calls.

@Robbepop
Copy link
Contributor

Robbepop commented Jul 5, 2023

Having the same-crate restriction in place without the restriction of same-signature for those calls would be another game changer for interpreters built on tail calls as it would allow us avoid unsafe code alltogether:

type Register = usize;

enum Trap {
    DivisionByZero,
}

enum Op {
    I32Add { result: Register, lhs: Register, rhs: Register },
    I32Div { result: Register, lhs: Register, rhs: Register },
    BrIf { condition: Register, offset: isize },
}

struct Executor {
    stack: Vec<i32>,
    ops: Vec<Op>,
    sp: usize,
    pc: usize,
}

impl Executor {
    fn dispatch(&mut self) -> Result<(), Trap> {
        match self.ops[self.pc] {
            Op::I32Add { result, lhs, rhs } => become self.execute_i32_add(result, lhs, rhs),
            Op::I32Div { result, lhs, rhs } => become self.execute_i32_div(result, lhs, rhs),
            Op::BrIf { condition, offset } => become self.execute_br_if(condition, offset),
        }
    }

    fn execute_i32_add(&mut self, result: Register, lhs: Register, rhs: Register) -> Result<(), Trap> {
        self.stack[self.sp + result] = self.stack[self.sp + lhs].wrapping_add(self.stack[self.sp + rhs]);
        self.pc += 1;
        become self.dispatch()
    }

    fn execute_i32_div(&mut self, result: Register, lhs: Register, rhs: Register) -> Result<(), Trap> {
        let rhs = self.stack[self.sp + rhs];
        if rhs == 0 {
            return Err(Trap::DivisionByZero)
        }
        self.stack[self.sp + result] = self.stack[self.sp + lhs].wrapping_div(rhs);
        self.pc += 1;
        become self.dispatch()
    }

    fn execute_br_if(&mut self, condition: Register, offset: isize) -> Result<(), Trap> {
        if self.stack[self.sp + condition] != 0 {
            self.pc = self.pc.wrapping_add_signed(offset);
        } else {
            self.pc += 1;
        }
        become self.dispatch()
    }
}

@phi-go
Copy link
Author

phi-go commented Jul 5, 2023

@comex wrote:

I don't see what's so bad about the attribute. I don't want to go in circles here, but… the RFC says of requiring an attribute that "while quite noisy it is also less flexible than the chosen approach." But requiring an exact function signature match is drastically less flexible! Regardless of whether the initial implementation allows non-matching signatures, we should want to support them eventually.

To be clear the RFC currently only discusses guaranteed TCE given matching function signatures to keep the feature as small as possible. This is also the context this sentence in the RFC should be seen in.

Regarding the attribute on the function declaration are you thinking of requiring it in addition to become (1) or automatically promote a tail call of those functions (2)?

(2) is the version discussed in the RFC and I hope it is clear why it is described as less flexible.

(1) Has not really been discussed as far as I remember. However, to my understanding, even LLVM only supports functions that have nearly identical ABI's. WebAssembly currently does not even allow TCO. In both cases I expect the push to support non-matching (or matching for WebAssembly) function signatures to be done at a later point, at that time we can still require attributes on function declarations if needed. In the meantime I would expect the WebAssembly backend to raise a compiler error as per the RFC. Though, I can see that it could be unappealing to introduce this attribute later, so I'm not sure this would be the right approach. Note that I expect that the attribute is mainly interesting for tail calling function pointers and it's "marker" would need to become part of the function signature. For same-crate static calls I think the suggestion by @digama0 should be preferred. This, however, would imply that the attribute need only be added for function that are tail called via function pointer with mismatched function signature, which could be confusing.

@scottmcm
Copy link
Member

scottmcm commented Jul 5, 2023

It would seem entirely reasonable to me to add a extern "rust-tail" fn ABI to the tail call experiment -- under a separate feature flag! -- that would allow greater flexibility in signatures that can be used in tail calls, even though that-ABI function pointers, at the cost of breaking signature compatibility with normal extern "Rust" fn pointers.

I would expect that to be less -- or at least simpler -- work than all the syntax and drop order changes in the experiment, so adding more options to try out and see how they fit with different things that people wish to be able to do sounds like a good way to help find out which way work more smoothly for things people want to be able to do.

(We could have tail_call_syntax, tail_call_same_signature, tail_call_abi feature gates, perhaps, and figure out which parts of things to RFC & stabilize, if any.)

@digama0
Copy link
Contributor

digama0 commented Jul 6, 2023

It would seem entirely reasonable to me to add a extern "rust-tail" fn ABI to the tail call experiment -- under a separate feature flag! -- that would allow greater flexibility in signatures that can be used in tail calls, even though that-ABI function pointers, at the cost of breaking signature compatibility with normal extern "Rust" fn pointers.

Is there a proposal that actually requires having a separate ABI for tail-callable functions (or is it tail-calling functions?)? The RFC proposal works just fine with extern "Rust" fn, and the same-crate compilation strategy relies on the internal functions not being directly referenced at all (it is for compiler-only use), and as such wouldn't fit as extern "rust-tail" fn since different individual functions with the same argument and returns could nevertheless have a different call ABI, depending on what other functions are involved in the call graph strongly connected component they participate in.

I think @DemiMarie mentioned that a caller-pops calling convention could be more flexible wrt varying arguments, but I'm not sure this is a good default choice and if we have a specific extern "rust-tail" fn ABI then it seems inevitable that people would use it for callee-pops since otherwise what is the point of having a separate calling convention if it's just the same as the old one? I guess there should be some plan for what this calling convention should actually be and what tradeoffs it is making before making it a mandatory part of the become syntax.

@scottmcm
Copy link
Member

scottmcm commented Jul 6, 2023

Is there a proposal that actually requires having a separate ABI for tail-callable functions (or is it tail-calling functions?)?

Well, given that LLVM has a specific tailcc calling convention, which ensures that calls in tail position are always tail-optimized.

rustc currently uses LLVM's ccc (as can be seen by the last of explicit cc in the LLVM IR output) for extern "rust", so assuming it doesn't do nothing, it seems plausibly useful for things like tables of function pointers or cross-crate becomes or something.

Do we need it? I don't know. But it at least seems like a plausible experiment, like we have an existing experiment for coldcc (rust-lang/rust#97544).

But tailcc (or fastcc) uses callee-pops, so calls that aren't TCO'd need to adjust the stack, and thus it sounds like we wouldn't necessarily want to change everything to that cc.

@WaffleLapkin
Copy link
Member

@digama0 the extern "rust-tail" fn proposal is similar to your "same-crate" one, in that it allows more code in some cases. I.e. I can imagine the end-game requirement of (musttail) become be something like

become f(...) requires at least one of the following:

  • Caller and callee function signatures match exactly (modulo lifetimes)
  • Caller and callee are defined in the same crate
  • Caller and callee both use "rust-tail" calling convention

I'll try to implement those relaxations in the experiment (once the things described in the RFC right now are fully implemented and merged...), but I also want to highlight @phi-go's mention that those relaxations can be RFC-ed separately from this RFC (and they probably should, to keep the scope smaller!).

@joshtriplett
Copy link
Member

@Robbepop Using ? with that meaning seems like a non-starter given Rust's existing usage of ?.

@joshtriplett joshtriplett removed the I-lang-nominated Indicates that an issue has been nominated for prioritizing at the next lang team meeting. label Aug 1, 2023
@joshtriplett
Copy link
Member

Un-nominating the RFC for now.

To be explicitly clear on next steps: 👍 for going ahead with one or more experiments (including a tail-call with placeholder keyword, and a tail call calling-convention as @scottmcm suggested), blocking concerns for an experiment or RFC that uses the demo-looks-done approach of become (not dead-set against it but very much currently feeling there are better alternatives).

@Robbepop
Copy link
Contributor

Robbepop commented Aug 10, 2023

@Robbepop Using ? with that meaning seems like a non-starter given Rust's existing usage of ?.

I wouldn't say it is a non-starter. The same reasoning could have been applied to the highly debated .await syntax were .something only referred to field accesses before. I know why it isn't good to introduce "new" syntax but sometimes it takes courage to try something new. And new doesn't always mean it is a bad thing.

blocking concerns for an experiment or RFC that uses the demo-looks-done approach of become (not dead-set against it but very much currently feeling there are better alternatives).

If during all the time of the RFC thread I'd have seen a single alternative syntax that was actually cutting it, I'd wholeheartedly agree. Yet, we do not even has consensus what an agreeable placeholder syntax could look like. A bit of direction from the people who are deciding over nomination could probably help here.

In light of the RFC's un-nomination I would like to know what the status of the work behind the RFC is as this RFC thread has been very silent since quite a few weeks now. What in particular is the stance of @WaffleLapkin (RFC implementer) and @phi-go (RFC author) about the next steps provided by @joshtriplett ?

I personally just hope that this RFC won't die due to too much bikeshedding. It would be really sad to lose the momentum that has been built up for this long awaited Rust feature.

@phi-go
Copy link
Author

phi-go commented Aug 10, 2023

To my understanding we are just waiting for the implementation of the experiment. Here is the tracking issue: rust-lang/rust#112788. So un-nominating until the implementation is done seems fine to me.

Regarding the placeholder syntax, we are indeed waiting on a decision for the actual syntax. The current state is that the current implementation uses become and it seems non-trivial to change but @WaffleLapkin will know better.

I find become? quite intuitive, so I would have hoped for some contemplation. The only other candidate syntaxes that have been discussed for longer are using an attribute, and the return variation.

bors added a commit to rust-lang/rust-analyzer that referenced this pull request Feb 14, 2024
feature: Add basic support for `become` expr/tail calls

This follows rust-lang/rfcs#3407 and my WIP implementation in the compiler.

Notice that I haven't even *opened* a compiler PR (although I plan to soon), so this feature doesn't really exist outside of my WIP branches. I've used this to help me test my implementation; opening a PR before I forget.

(feel free to ignore this for now, given all of the above)
@lolbinarycat
Copy link
Contributor

the rfc does not link to the tracking issue

@phi-go
Copy link
Author

phi-go commented Sep 28, 2024

@lolbinarycat Looking at other RFCs, the tracking issue does not seem to be included. A link is in the first comment here anyway, so seems fine to me.

@lolbinarycat
Copy link
Contributor

a lot of other RFCs don't have tracking issues.

it's supposed to go under the Rust Issue: #0000 bit, replacing the placeholder 0000.

@phi-go
Copy link
Author

phi-go commented Sep 28, 2024

I see, updated. I thought those should be filled in when the RFC is accepted.

@scottmcm
Copy link
Member

scottmcm commented Oct 1, 2024

I stumbled on this earlier: https://github.com/bytecodealliance/rfcs/blob/main/accepted/pulley.md

The wasmtime folks sound interested in using become for an interpreter.

@fitzgen
Copy link
Member

fitzgen commented Oct 2, 2024

The wasmtime folks sound interested in using become for an interpreter.

Indeed we are! We even have an open PR that is just about ready to merge and adds a switch to turn become usage on/off: bytecodealliance/wasmtime#9251

@traviscross
Copy link
Contributor

This came up multiple times, with positive affection, in recent extended lang meetings. And since then, I've become aware of an intriguing use case for this that involves interoperability with Rust and our ecosystem. Seemingly it is on our minds.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.