How Deep Neural Networks Work

100 thoughts on “How Deep Neural Networks Work”

  1. Ugh… I feel pretty dumb now. Everybody seems to understand this perfectly but I got lost a bunch of times…

  2. Good vid thanks.
    Perhaps obvious to some, but it could be confusing: it should be understood that the white and black pixels in the illustration are not indicative of the actual pixel color of the image as shown; they are merely 'containers' for the true pixel greyscale value.

  3. How do you avoid getting stuck in a local minimum? Or specifically, how do you get to the global minimum?

  4. I understand calculus, and I feel that from a non-math perspective you explained this very well (physics background here).

  5. Thank you for the Polish subtitles — and a sub from me as well. That diagram on YT says a lot: you can follow the path from start to end just by looking at the picture. Thanks very much — great idea to post your video with this picture; it saved me a lot of time. Ty. Now I'll press the play button. Question: how do we program qubits? With which tool? What is the name of that language? 16 qubits so far… I know it's a million-dollar question. 😀 But that's the new "BASIC", like 8 bits in the 1980s. I want to study that new one, but I don't know the name of the compiler used in Google's 16-qubit machine. Btw, 16-digit passwords are solved in a fraction of a second by Google's new machine at −273.15 Celsius. And I liked that "puzzle" because it took me 10 solid minutes to follow all the paths and end up with the same result as on the diagram, even while looking for flaws. 10 minutes because I'm not "bright" 😀, just average.

  6. Liked the bit where you showed that direct mapping from input to output was not possible. And I’m ok with the description here of what a given layer might be doing. What I’d like to know is how best to optimize the number of layers and the number of neurons in each — hopefully via some method other than trial and error. Are there guidelines for this?

  7. See the complete site www.VitaLiberu.pt or Vitaliberu.pt for the Portuguese…

    Thank you
    Paulo Roque Paulo Roque Silva

  8. Sure, you can't detect the pattern from JUST whether INDIVIDUAL cells are light or dark, but it's pretty easy to just hard-code the logic circuit for a pattern that size, much easier than training a neural network.
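
    For what it's worth, here is a minimal sketch of such a hard-coded check. The pattern classes (solid / horizontal / vertical / diagonal) and the pixel ordering are assumptions for illustration, not necessarily exactly what the video uses:

        def classify(p):
            # p = [top_left, top_right, bottom_left, bottom_right], greyscale values in [-1, 1]
            dark = [x < 0 for x in p]                       # threshold each pixel
            if dark[0] == dark[1] == dark[2] == dark[3]:
                return "solid"
            if dark[0] == dark[1] and dark[2] == dark[3]:
                return "horizontal"                         # top row differs from bottom row
            if dark[0] == dark[2] and dark[1] == dark[3]:
                return "vertical"                           # left column differs from right column
            return "diagonal"                               # everything else falls through here in this toy sketch

        print(classify([0.9, -0.8, -0.7, 0.8]))             # -> "diagonal"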

  9. I heard from my professor, whose research is in neural networks, that this method actually DOESN'T work for deep neural networks. The problem is that if you use backpropagation, then after five or so layers the end result gets too watered down from all the chaining to be detectable. As such, the distinction between deep learning and old school neural networks is the invention of new techniques that got around this problem. Do you know anything about this?
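
    What the professor is describing is usually called the vanishing gradient problem; ReLU-style activations, better weight initialization, and layer-wise pre-training are among the techniques commonly credited with working around it. A rough sketch of why the signal gets "watered down" with sigmoid units, assuming for illustration a pre-activation of 0.5 at every layer:

        import math

        def sigmoid(x):
            return 1.0 / (1.0 + math.exp(-x))

        # Backpropagation multiplies one sigmoid derivative per layer, and that
        # derivative is never larger than 0.25, so the gradient reaching the
        # early layers shrinks roughly geometrically with depth.
        grad = 1.0
        for layer in range(1, 11):
            x = 0.5                                  # illustrative pre-activation value
            grad *= sigmoid(x) * (1.0 - sigmoid(x))  # sigmoid'(x) <= 0.25
            print(f"gradient factor after {layer} layers: {grad:.8f}")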

  10. Great explanation!
    Now I understand that weights are to be determined via training data.

    How about the network itself?
    1) You said you can add 2, 3, 7, 700 etc. layers.
    Is it arbitrary? How do we know/determine how many layers would be optimal?

    2) The connections in the network: are they arbitrary? Or how do we determine what is/are the best/optimal?

    The more hidden layers/nodes/connections there are, the more computationally demanding it is. But does it always make the neural network better in terms of being closer to predicting the truth?

    If it is not made clear in the video, where do I find materials that make these points clear?

    Again thanks so much for the great lecture!

  11. Thank you for the video. So the basic building blocks of a NN are weighted connections, the sigmoid function, and the linear rectifier, according to this video. Do most people just put the sigmoid and rectifier in series after each receptor? If not, how does one figure out when to use the sigmoid function and when to use the rectified linear unit?

  12. Really great explanation, Brandon. Also, I greatly appreciate that you share your slides as well, and in raw (PPT) format at that. Great work.

  13. Still don't quite get it… Let's say that I have the network design 2-3-3-2. How do I use the chain rule here if I want to calculate a weight next to the input? Which path should I take, through which neurons… Or maybe I'm totally wrong…
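
    The short answer is that you don't pick one path: the chain rule adds up the contribution of every path from that weight to the output, and backpropagation just organizes that sum layer by layer. A toy sketch with made-up numbers, using a tiny 1-1-2-1 network so there are exactly two paths to sum:

        import math

        def sigmoid(z):
            return 1.0 / (1.0 + math.exp(-z))

        def forward(x, w1, wa, wb, va, vb):
            # Toy 1-1-2-1 network: x -> h1 -> (h2a, h2b) -> y, linear output
            h1 = sigmoid(w1 * x)
            h2a, h2b = sigmoid(wa * h1), sigmoid(wb * h1)
            return va * h2a + vb * h2b

        x, w1, wa, wb, va, vb = 0.8, 0.3, -0.5, 0.7, 0.2, -0.4
        h1 = sigmoid(w1 * x)
        h2a, h2b = sigmoid(wa * h1), sigmoid(wb * h1)

        # dy/dw1 sums BOTH paths (through h2a and through h2b),
        # each one a product of the local derivatives along that path.
        path_a = va * h2a * (1 - h2a) * wa
        path_b = vb * h2b * (1 - h2b) * wb
        analytic = (path_a + path_b) * h1 * (1 - h1) * x

        # Sanity check against a finite difference.
        eps = 1e-6
        numeric = (forward(x, w1 + eps, wa, wb, va, vb)
                   - forward(x, w1 - eps, wa, wb, va, vb)) / (2 * eps)
        print(analytic, numeric)   # the two numbers should agree closely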

  14. Can I ask a question? You weighted four pixel values and squashed the result to get a value of 0.746 at 5:00 in the video, but how can ONE value represent TWO pixels at 6:30?
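
    For illustration, with made-up weights and pixel values (not the video's 0.746): a single receptor forms one weighted sum over several pixels and squashes it, so one output number stands in for all of its inputs.

        import math

        # Made-up weights and pixel values, purely for illustration.
        pixels  = [0.85, -0.75]                  # one light pixel, one dark pixel
        weights = [1.0, -1.0]                    # this unit responds to "light left, dark right"
        weighted_sum = sum(w * p for w, p in zip(weights, pixels))
        output = math.tanh(weighted_sum)         # ONE squashed number summarizing TWO pixels
        print(output)                            # about 0.92 with these numbers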

  15. Why did you say that you can configure your network with as many neurons per layer as you want? Because in the example of the 4-pixel square, if you added one more neuron in the 2nd layer (for example), it would have the same value and weights as some other neuron in the same layer… Isn't there a point where adding neurons is useless?

  16. Isn't that tanh? I've seen 2 types of sigmoids:
    1. tanh(x)
    2. 1/(1+exp(-x))
    Which should I use, though?

    BTW, what do you mean by the error function? Is it used to pre-calculate the error, or to calculate the current one?
    (I'm assuming pre-calculate, because you would use |truth − answer| for calculating the current one anyway.)

    Also, could you make a video on how to apply backpropagation to the weights?
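
    For what it's worth, both curves are "sigmoid-shaped": the logistic function 1/(1+exp(−x)) squashes to (0, 1), tanh squashes to (−1, 1), and tanh is just a shifted, rescaled logistic, so the choice mostly comes down to which output range you want. A minimal sketch:

        import math

        def logistic(x):
            return 1.0 / (1.0 + math.exp(-x))    # output range (0, 1)

        for x in (-2.0, 0.0, 2.0):
            # tanh(x) == 2 * logistic(2 * x) - 1, output range (-1, 1)
            print(x, logistic(x), math.tanh(x), 2 * logistic(2 * x) - 1)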

  17. https://pastebin.com/it31h0Ex

    A machine learning algorithm demo purely in Python — no modules other than math and random. Simplified, but it still does the trick… the results soon converge to 1.0 (or just above).
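
    The pastebin's code isn't reproduced here, but here is a sketch in the same spirit, using only math and random: a single sigmoid neuron trained by gradient descent to learn AND.

        import math
        import random

        def sigmoid(z):
            return 1.0 / (1.0 + math.exp(-z))

        data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]   # the AND function
        w = [random.uniform(-1, 1), random.uniform(-1, 1)]
        b = random.uniform(-1, 1)

        for _ in range(20000):
            x, target = random.choice(data)
            y = sigmoid(w[0] * x[0] + w[1] * x[1] + b)
            grad = (y - target) * y * (1 - y)     # chain rule through squared error and sigmoid
            w[0] -= 1.0 * grad * x[0]
            w[1] -= 1.0 * grad * x[1]
            b    -= 1.0 * grad

        print([round(sigmoid(w[0] * a + w[1] * c + b), 2) for (a, c), _ in data])
        # should drift toward [0, 0, 0, 1]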

  18. Excellent explanation! Thank you for your hard work. Really enjoyed it. On another note, has anyone ever told you that you sound like Ryan Reynolds? It was like having Deadpool explain neural networks to me, minus the foul language. 🙂

  19. Any advice on when to use ReLU and when to use sigmoid? I wrote one in C++ with all sigmoids… I wondered if you could go into more theory about which squashing function to use and why.
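
    For reference, a quick sketch of the two functions being compared; one commonly cited reason to prefer ReLU in deep stacks is that its derivative is 1 for positive inputs, so gradients don't shrink the way they do through saturating sigmoids.

        import math

        def sigmoid(x):
            return 1.0 / (1.0 + math.exp(-x))     # smooth, saturates for large |x|

        def relu(x):
            return max(0.0, x)                     # cheap, derivative is 1 for x > 0

        for x in (-3.0, -0.5, 0.0, 0.5, 3.0):
            print(x, sigmoid(x), relu(x))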

  20. I think at 4:45 the function is tanh rather than the sigmoid function, because the sigmoid function's y-axis range is from 0 to 1, while tanh's y-axis range is from −1 to 1.

  21. I have one question. In the chaining example, you treated 'e' as the output, but while describing it you interpreted 'e' as the error. Why is that? And also, is it possible to know the error function?
    Thanks in advance.
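
    One common choice of error function, stated here as an assumption rather than as the video's exact function, is the squared difference between the network's output and the truth; its derivative is what the chain rule pushes backwards.

        def error(prediction, truth):
            # Squared error: zero when the output matches the truth, growing as they diverge.
            return 0.5 * (prediction - truth) ** 2

        # Its derivative with respect to the prediction is simply (prediction - truth),
        # which is the quantity the chain rule propagates back through the network.
        print(error(0.8, 1.0))   # -> 0.02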

  22. Excellent explanation with a simple example for understanding NNs!! But one mistake suddenly gets corrected at 9:58.

  23. Thank you so much, it's great work you did, sir. I hope you will continue on the right path of teaching and sharing information; you are the best 😉

  24. I really wanted to know what it determined that grayish box to be… I was struggling with that the whole time, and was hoping for a percentage-based definitive assessment from the neural network! Did I miss something?

  25. Holy shit! Now I… I actually get it!
    Thank you!
    Clean, concise, informative, astonishingly helpful, you have my deepest gratitude.

  26. Obviously you'd never use a neural net in practice to do this, because there are literally two possible positive results that are easy to check. Or you could use whether it's a checkerboard as a feature, because it's so easy to compute for 2×2.

  27. I was amazed by the way you talk and explain very slowly; you remain slow until the end and you don't rush things. Bravo.

  28. If it's all one gradient descent, isn't a NN prone to getting stuck in a local minimum?

    Rephrased on another level: Does a NN try out different "theories" during training?

  30. Brilliant! I was involved 50 years ago in a very early AI project and was exposed to simple neural nets back then. Of course, having no need for neural nets, I forgot most of what I ever knew about them during the interval. And, wow, has the field expanded since then. You have given a very clear and accessible explanation of deep networks and their workings. Will happily subscribe and hope to find further edification on Reinforcement Learning from you. THANK YOU.

  31. 30 years ago, I studied computer science. We were into pattern recognition and stuff, and I was always interested in learning machines, but couldn't get the underlying principle. Now I get it. That was simply brilliant. Thanks a lot.
