octave-maintainers

Re: Proceeding with the GSoC Project


From: Sudeepam Pandey
Subject: Re: Proceeding with the GSoC Project
Date: Fri, 27 Apr 2018 04:42:32 +0530



On Fri, Apr 27, 2018 at 3:23 AM, Bradley Kennedy <address@hidden> wrote:

On 2018-04-26, at 17:11, Doug Stewart <address@hidden> wrote:



On Thu, Apr 26, 2018 at 4:47 PM, Sudeepam Pandey <address@hidden> wrote:


On Fri, Apr 27, 2018 at 1:12 AM, Doug Stewart <address@hidden> wrote:


On Thu, Apr 26, 2018 at 3:26 PM, Sudeepam Pandey <address@hidden> wrote:
So I have set up my blog at [1] and a public repository at [2].

My public repo at Bitbucket contains a branch named "Did_you_mean" where I plan to push all the small changes that I make to the project. A larger changeset can then be pushed to the default branch.

a) Please take a look at these and inform me about anything that you'd like changed.
b) Kindly also inform me about anything that needs to be done after setting up the repo and the blog.

Other than that, can anyone direct me to a link where all the existing functions of GNU Octave can be found in the form of a list? I know about this [3] link, but there a description of each function is included with the function name. I essentially require only the function names, so if I copy anything from that page I'll have to clean it up first. If such a list exists, that's good; otherwise I'll just proceed with the comprehensive list on that page.


Getting a list of all Octave commands is one of the steps towards your goal.
Getting a list of all package commands is another step towards your goal, etc.

Would you like me to make any changes to the bitbucket repository or my blog?

I don't see any ideas as to how to handle packages! How are you going to handle the situation where the user has a package loaded and is typing in a command word from the package?

Yeah Doug,

As a person who researches in the area of machine learning I see many issues with using this for Octave. An immediate thought is “are we trying to adapt a problem to the solution?” Neural networks have issues like fixed input/output size (which you already ran into), model capacity, bias (as in mathematical bias), and catastrophic forgetting (with continual learning).

Doug brings up a great point: the only way a NN would be feasible with dynamic loading of packages would be either to use continual learning (which might not work with one-hot encodings?) or to retrain the model on package loading. But at some point we are also going to have to increase model capacity in order to accommodate new function names. Also, you appear to be encoding your inputs as ASCII, which is probably less than ideal, given that it assumes inputs in your feature space are closer if their ASCII distance is smaller. In natural language processing we usually use word embeddings, but since the names of functions aren't really natural language, they probably wouldn't apply here. You could use LSTMs or convolutions, which may help with your input-length problem, but those are just going to slow you down and make this less feasible since you're doing it all in Octave.

A smart implementation of edit distance would likely serve Octave better. Using assumptions about the errors that users commonly make would let you shrink the search space, and since reasonable implementations of edit distance are Θ(mn), with m and n being the lengths of the two strings, we only have to multiply that by the number of functions (or perhaps names) in Octave. With some clever tricks you can reduce the number of functions you have to check in this scheme: for instance, assuming the user knows the first character of the function they want might be sufficient, or you could only look at words that are similar in length. These are things you'd want to research first anyway; I'm sure it is a solved problem.
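To make the Θ(mn) claim and the pruning idea concrete, here is a minimal sketch in Python (purely illustrative; an actual implementation for this project would live in Octave, and the function names, thresholds, and helper names below are my own assumptions, not anything from the project):

```python
def edit_distance(a, b):
    """Classic dynamic-programming Levenshtein distance, Theta(m*n)
    time with only two rows of the table kept in memory."""
    m, n = len(a), len(b)
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            curr[j] = min(prev[j] + 1,         # deletion from a
                          curr[j - 1] + 1,     # insertion into a
                          prev[j - 1] + cost)  # substitution
        prev = curr
    return prev[n]

def suggest(typo, function_names, max_len_diff=2):
    """Prune candidates by length first, so the Theta(m*n) check
    runs on far fewer names; the length cutoff is an assumption."""
    candidates = [f for f in function_names
                  if abs(len(f) - len(typo)) <= max_len_diff]
    if not candidates:
        candidates = function_names
    return min(candidates, key=lambda f: edit_distance(typo, f))
```

For example, `suggest("pltt", ["plot", "plus", "printf"])` picks "plot", since its distance (one substitution) beats the others after the length filter.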

As model capacity increases we also have to look at the computational complexity of the model. Retraining the network with more examples will take longer, and, as we discovered, I don't think we can use a pretrained network.

So in summary the problems are:
- Scalability
  - Performance
  - Fixed input/output size
  - Training time
- Dynamic names
- Simpler solutions exist
- Model encoding is poor

I think the most common issue is going to be dynamic names: given that we are using training examples from an external source, they would have to be packaged somehow with the distro.

Some mixed method might be interesting, but again I think a well-designed edit distance would serve better.

Cheers,
Brad Kennedy

Edit distance is indeed an excellent algorithm for this job, but I disregarded it only because a 'smart' implementation would require some really strong assumptions, like, for instance, assuming that the user will always type the first character correctly.
I have also gone through the problems that you have stated about neural networks. Most of these can likewise be solved with some reasonable assumptions, and I have concluded that the decision about whether to use neural nets or edit distance ultimately comes down to which assumptions we consider more practical.

Examples of solutions to the problems you stated...

For the input size problem, it is reasonable to assume that the user would, at worst, misspell a word with no more than 3 extra characters, so we can keep the input layer size = 3 + the length of the longest function name in Octave.
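That fixed-size scheme could be sketched roughly as follows (Python for illustration only; the real work would be in Octave, and the maximum name length, the ASCII normalization, and the zero padding are all illustrative assumptions on my part):

```python
MAX_NAME_LEN = 21              # hypothetical longest function name length
INPUT_SIZE = MAX_NAME_LEN + 3  # allow up to 3 extra typed characters

def encode(word):
    """Encode a (possibly misspelled) name as a fixed-length vector of
    normalized ASCII codes, zero-padded on the right."""
    if len(word) > INPUT_SIZE:
        word = word[:INPUT_SIZE]  # truncate pathological inputs
    codes = [ord(c) / 127.0 for c in word]
    return codes + [0.0] * (INPUT_SIZE - len(codes))

vec = encode("pltt")  # 4 character codes followed by 20 zeros
```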

Due to ASCII encoding, something like this will happen: say we have two proper words, 'amuse' and 'abuse', and someone types in 'atuse'. In such a case, the network by default would output 'amuse', because the ASCII code of 't' is closer to that of 'm' than it is to that of 'b'. I did not consider this to be a problem, because essentially the network really is giving us 'the closest match'; however, even if we do consider it a problem, taking all the classes within a probability range, instead of only the class of highest probability, would solve it.
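The "all classes within a probability range" idea could look something like this (a Python sketch under my own assumptions: a softmax over the output layer and an arbitrary margin of 0.1; neither comes from the project itself):

```python
import math

def candidates_within(scores, names, margin=0.1):
    """Softmax the raw output scores and return every class whose
    probability lies within `margin` of the best one, not just argmax."""
    exps = [math.exp(s - max(scores)) for s in scores]  # stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(probs)
    return [n for n, p in zip(names, probs) if best - p <= margin]
```

With nearly tied scores for 'amuse' and 'abuse', both would be suggested, so the user is no longer at the mercy of ASCII proximity alone.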

Just as we would need to add new function names to the edit distance checklist with each new Octave release, we would need to retrain the neural network to incorporate new functions with each release. That is doable, since we will be the ones making these changes at every release; the user won't be affected by this. Training time is a concern, but it won't be so large as to delay a release, so it's doable, and using GPU services is obviously also an option. The increase in the complexity of the model as its capacity grows is real, but likewise the edit distance algorithm will face a similar increase in complexity because it will have more words to look at.

The only real problem with neural networks, in my view, is dynamic package loading. That, as I said, 'could be solved' by using a large neural network that incorporates all the existing functions of Octave (core + Forge). I accept that it may not be the most optimal solution, though.

I think it would be best if we come up with a solution to the problem using both approaches. Then we can look at both and mutually decide which method makes the more practical assumptions and which one is more optimized.
  
 

 

On Thu, Apr 26, 2018 at 5:27 AM, Nicholas Jankowski <address@hidden> wrote:


On Wed, Apr 25, 2018, 17:56 Doug Stewart <address@hidden> wrote:


On Wed, Apr 25, 2018 at 5:49 PM, Sudeepam Pandey <address@hidden> wrote:
Thank you Doug. I have gone through these links. I'll inform both of you after I finish the following tasks...

1) Initialize a public repository on Bitbucket for my project.
2) Setup a blog to report the weekly work, write an initial post, and link it to the Bitbucket repository.

It would be really helpful for me if both of you could share your preferred mode and time of communication with me.

Additionally, I would like to inform both of you and the entire Octave community in general, that Shane from octave-online had shared a little more than 100,000 user sessions, and a list of 1000 common misspellings with me via email previously.

I do have some technical doubts, some propositions, and some discussions to make. Should we proceed with them here or take them back to bug 46881 [1]?

[1]: https://savannah.gnu.org/bugs/?46881


I think that we should talk here.

Probably best. I'm UTC-4, so most days of the week async communication will be most practical, and email is as good as anything for that.





--
DAS








