This repository has been archived by the owner on May 21, 2022. It is now read-only.

Roadmap #1

Open
tbreloff opened this issue Oct 11, 2016 · 2 comments

Comments

@tbreloff
Member

I've been thinking through the design of the meta/sub learners, and I think it will be pretty reasonable to get Optim functionality into this framework. Here's a rough concept of what I think that looks like:

# keep vector of: ‖θ_true - θ‖
tracer = Tracer(Float64, (model,i) -> norm(θ_true - params(model)))

# build the MetaLearner
learner = make_learner(
    GradientLearner(...),
    TimeLimit(60), # stop after 60 seconds
    MaxIter(1000), # stop after 1000 iterations
    ShowStatus(100), # show a status update before iterating and every 100 iterations
    Converged(params, tol=1e-6), # similar to x_converged for the function-case
    Converged(output, tol=1e-6), # similar to f_converged for the function-case
    Converged(grad, tol=1e-6, every=10), # similar to g_converged for the function-case
                                         # note: we can also only check every ith iteration
    tracer
)

# learn! is like Optim.optimize
learn!(model, learner)

# note: for the function minimization case, it "iterates" over obs == nothing, since the
#   `x` in f(x) is treated as learnable parameters in Transformations.Differentiable, NOT input
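To make the `obs == nothing` point concrete, here's a minimal self-contained sketch (the type and function names below are stand-ins for illustration, not the actual Transformations/StochasticOptimization API): the `x` of `f(x)` lives inside the model as learnable parameters, so the learning loop needs no input data at all.

```julia
# Hypothetical stand-in for a Differentiable wrapping f(x) = ‖x - x*‖²;
# the "x" is stored as learnable parameters, never passed in as an observation.
mutable struct QuadModel
    θ::Vector{Float64}        # learnable params == the x of f(x)
    target::Vector{Float64}   # x*, the minimizer
end
params(m::QuadModel) = m.θ
grad(m::QuadModel) = 2 .* (m.θ .- m.target)   # ∇f(x) = 2(x - x*)

# the loop effectively iterates over obs == nothing:
# every step uses only the model's own parameters, no input data
function minimize!(m::QuadModel; lr = 0.1, maxiter = 1000, tol = 1e-8)
    for _ in 1:maxiter
        g = grad(m)
        m.θ .-= lr .* g
        sum(abs2, g) < tol && break
    end
    return m
end

m = QuadModel(zeros(3), [1.0, 2.0, 3.0])
minimize!(m)
```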

Something like this would replace most of Optim.optimize, and individual algorithms would be implemented by creating sub-learners to do whatever is "special" as compared to common algos, reusing other components as necessary. For example, you may replace the SearchDirection within the GradientLearner with something specialized for that method.
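As a sketch of what "replace the SearchDirection" could look like (all names here are hypothetical, just illustrating the pluggable-component idea): the learner's step depends only on an abstract `direction`, so a new method is a new subtype rather than a change to the loop.

```julia
abstract type SearchDirection end

# plain steepest descent: the direction is just -∇f
struct SteepestDescent <: SearchDirection end
direction!(sd::SteepestDescent, g) = -g

# a stateful alternative -- classical momentum -- dropped in without touching the loop
mutable struct Momentum <: SearchDirection
    μ::Float64
    v::Vector{Float64}
end
Momentum(μ, n) = Momentum(μ, zeros(n))
function direction!(sd::Momentum, g)
    sd.v .= sd.μ .* sd.v .- g   # accumulate velocity
    return sd.v
end

# the "GradientLearner" step sees only the abstract type
step!(θ, g, sd::SearchDirection; lr = 0.1) = (θ .+= lr .* direction!(sd, g); θ)

# minimize f(θ) = ‖θ‖² (so ∇f = 2θ) with the momentum component swapped in
θ = [1.0, -2.0]
sd = Momentum(0.5, length(θ))
for _ in 1:200
    step!(θ, 2 .* θ, sd)
end
```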

Line search and similar could be made generic as implementations of LearningRate, which is a sub-learner that knows how to calculate the learning rate for a step.
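For instance, backtracking (Armijo) line search could be packaged as one such LearningRate component. A sketch with made-up names (the real interface may differ):

```julia
abstract type LearningRate end

struct FixedLR <: LearningRate
    η::Float64
end
learning_rate(lr::FixedLR, f, θ, g) = lr.η

# backtracking (Armijo) line search expressed as a LearningRate sub-learner
struct Backtracking <: LearningRate
    η0::Float64   # initial step size
    ρ::Float64    # shrink factor
    c::Float64    # sufficient-decrease constant
end
function learning_rate(ls::Backtracking, f, θ, g)
    η, fθ, gg = ls.η0, f(θ), sum(abs2, g)
    # shrink until f(θ - η g) ≤ f(θ) - c η ‖g‖²  (with a safety cap)
    for _ in 1:50
        f(θ .- η .* g) <= fθ - ls.c * η * gg && break
        η *= ls.ρ
    end
    return η
end

f(θ) = sum(abs2, θ)          # f(θ) = ‖θ‖², so ∇f = 2θ
θ = [1.0, 2.0]
η = learning_rate(Backtracking(1.0, 0.5, 1e-4), f, θ, 2 .* θ)
```

With this quadratic, the full step `η0 = 1.0` overshoots and fails the sufficient-decrease test, so one halving lands on `η = 0.5`, which steps exactly to the minimizer.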

Common components could be added through a high-level api, similar to make_learner.

For a working example using SGD, see: https://github.com/JuliaML/StochasticOptimization.jl/blob/master/test/runtests.jl#L145

In the long term, beyond simple tracing, early stopping, and convergence checks, I plan on incorporating real time visualizations, animations, and more. So I think there's a benefit to get Optim functionality into this framework and pool resources.

cc: @pkofod @ChrisRackauckas @oxinabox

@ChrisRackauckas

ChrisRackauckas commented Oct 11, 2016

How does this compare with the changes already happening with Optim.jl? I'm not as up-to-date: so much to follow! But yes, I think this design makes a lot of sense.

@tbreloff
Member Author

@pkofod can hopefully answer better than me, but from what I've seen in the Optim codebase, there's implementation leakage and tight coupling between objects, tracing, temporary storage, etc. There are massive "state" objects that hold a range of diverse things and get passed all around, and I think adding new approaches/methods will sometimes require changes that touch a lot of files/code. I'm working toward alleviating a lot of these issues.

The "update refactor" probably decoupled pieces and made it more modular, but I'm not the one to comment there.

One thing that is not obvious: it's hard to build your own sub-components and inject them into the optimize loop, unless you can somehow wrap them in an iteration callback. Often you need callbacks at different points of the iteration depending on the specific task, and that's tough right now.
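The hook-based MetaLearner design addresses exactly this: each sub-learner can tap into the loop at setup, per iteration, and teardown, rather than squeezing everything through a single iteration callback. A rough self-contained illustration, reusing the `Tracer`/`MaxIter` names from the sketch above (hypothetical signatures; the real StochasticOptimization interface may differ):

```julia
abstract type SubLearner end
# default no-op hooks: a sub-learner overrides only the phases it cares about
pre_hook(::SubLearner, model)     = nothing
iter_hook(::SubLearner, model, i) = nothing
post_hook(::SubLearner, model)    = nothing
finished(::SubLearner, model, i)  = false

struct MaxIter <: SubLearner
    n::Int
end
finished(s::MaxIter, model, i) = i >= s.n

struct Tracer <: SubLearner
    f::Function
    storage::Vector{Float64}
end
Tracer(f) = Tracer(f, Float64[])
iter_hook(t::Tracer, model, i) = push!(t.storage, t.f(model, i))

# the meta loop: every sub-learner hooks into every phase,
# so no single monolithic callback is needed
function learn!(model, update!, learners...)
    foreach(l -> pre_hook(l, model), learners)
    i = 0
    while !any(l -> finished(l, model, i), learners)
        i += 1
        update!(model, i)
        foreach(l -> iter_hook(l, model, i), learners)
    end
    foreach(l -> post_hook(l, model), learners)
    return model
end

model = [1.0]
tracer = Tracer((m, i) -> m[1])
learn!(model, (m, i) -> (m[1] /= 2), MaxIter(5), tracer)
```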
