Using Machine Learning to Extract 3D Data

**JohnnyWaffles** · Feb 15th, 2018, 11:41 AM

Hello everyone,

So I just started getting into machine learning and it’s been a lot of fun. I took some online courses on Pluralsight to help me get started with TensorFlow, Google’s machine learning library. However, even with those courses I’m not completely sure how to handle the project I’m looking to undertake. I’m hoping there may be some people here who can point me in the right direction.

I’m actually looking to train an AI to watch YouTube videos of dogs and cats (yes, cat videos…). I want it to not only be able to identify the particular breed of animal (German Shepherd, Collie, etc.), but I also want it to learn the general behavior of the animal. From what I understand, a convolutional neural network would be ideal for identifying objects in images. However, is it any different for video?

Now why would I want to do this? Well I want the AI to have a general knowledge of how a particular breed moves. Not so much the behavior, but more so how it moves and how it doesn’t. I’m trying to develop an automated solution for drawing and animating animals frame by frame.

My thought is, if an AI can understand the 3D proportions of any particular object or animal, and its movements, then you should be able to direct the AI to literally draw the subject on screen in multiple frames. You could simply use Photoshop for the canvas. I’m not looking to copy content. That’s easy to do without training an AI. I want it to generalize its knowledge and create original animations and designs based on what it has learned.

So, I have been struggling to come up with a potential solution to this. How much instruction does an AI need to carry out this task? I mean, if I were programming this with a rule-based approach, I would try to use a form of photogrammetry to produce a 3D point cloud from the video data, which could then be reconstructed into 3D in any fashion you like. This is essentially a form of computer vision. Sometimes, when not enough 3D information can be extracted from a series of 2D images, interpolation is used to estimate where detail should be added based on the existing point data. There are a few libraries to handle this sort of operation: MATLAB, Matplotlib (Python), and OpenCV.
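
As a rough illustration of the interpolation step described above, here is a minimal Python sketch that estimates a missing height value from sparse 3D samples. It uses inverse-distance weighting, which is only one of many interpolation schemes and is chosen here for simplicity; the function name `idw_height` and the sample points are made up for the example, not taken from any of the libraries mentioned.

```python
import math

def idw_height(points, x, y, power=2.0):
    """Estimate z at (x, y) from sparse (x, y, z) samples.

    Uses inverse-distance weighting: closer samples count more.
    `power` controls how quickly influence falls off (assumed default 2).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for px, py, pz in points:
        dist = math.hypot(x - px, y - py)
        if dist == 0.0:
            return pz  # the query sits exactly on a known sample
        weight = 1.0 / dist ** power
        weighted_sum += weight * pz
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical sparse samples, e.g. four reconstructed surface points.
sparse_points = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 1.0),
    (0.0, 1.0, 1.0),
    (1.0, 1.0, 2.0),
]

# Fill in a point in the middle of the four samples.
center = idw_height(sparse_points, 0.5, 0.5)
print(center)
```

Because the query point is equally far from all four samples, the estimate is simply their average height. A real pipeline would more likely lean on a library routine, but the idea is the same.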

However, AIs do not use a rule-based approach. If provided with enough data (and the right kind), could it figure out how to get the desired output by itself? Or does it need a certain amount of instruction to get there, i.e. photogrammetry?

I suppose you could train a supervised learning network with certain 3D point cloud data so it knows what to look out for in the videos. I’m not sure if I even need to use 3D point clouds though. What do you guys think?

Sorry for the long post... I tried to be brief, but it’s rather complex.

**Schmidt** · Feb 15th, 2018, 08:23 PM

Before you jump into analyzing (and re-interpreting) existing movements (patterns which already "have evolved" over some time),
I guess "doing it from the other direction" (from the ground up, "simulating evolution") also has its value (and shouldn't be dismissed that fast)...

If you choose a nice Physics-Engine (perhaps starting with a 2D-one, later switching to 3D)...
and then define (according to certain "animal-types"):
- "segments in the right proportions and with appropriate mass"
- which are connected over "Joints" (which restrict certain movements, and allow others)
- and also support "spring-like bindings between segments" (to simulate muscles and tendons)

You can achieve surprising results even after only (relatively) few generations of an evolutionary algorithm,
in conjunction with a neural network, all "set up for certain goals to achieve".

Here's something like that, which demonstrates what one can accomplish with very simplified "stick figures",
and a still comparably small neural network (in conjunction with a simple 2D physics engine).
(YouTube demos, based on VB6-implemented solutions by reexre, a forum member here... you might send him a PM to "talk shop"):

Learn to stay vertical:
https://www.youtube.com/watch?v=6PWYYapOz_I

Learn to jump:
https://www.youtube.com/watch?v=U4B1yceNUik

And here is a somewhat more complex stick figure that resembles a kangaroo:
https://www.youtube.com/watch?v=m4E9sj9vH1I

I'm quite sure that with a bit more time invested into "better stick models" (getting mass and segment proportions right,
as well as more realistic restrictions on the interconnecting "Joints", muscles and tendons), accompanied by a larger neural network,
you will get pretty realistic movements out of it when you let it "evolve" over a few more generations than shown in the vids above.

If you then manage to get good results in 2D, switch to 3D and repeat.
If that works out as well, then you can store the 3D physics models, along with the neural-network parameters trained for certain typical movements, in a DB or something.
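
To make the evolve-and-select loop described above a bit more concrete, here is a toy Python sketch. It is not reexre's VB6 implementation and uses no real physics engine: the "body" is a single pole that tries to stay vertical, the "network" is one tanh neuron with two weights, and all constants (time step, torque limit, fall angle, population size) are assumptions picked for the example.

```python
import math
import random

DT = 0.02                    # simulation time step in seconds (assumed)
MAX_STEPS = 200              # length of one trial (assumed)
GRAVITY_OVER_LENGTH = 9.81   # g / L for a 1 m pole
MAX_ACCEL = 20.0             # strongest push the "muscle" can apply (assumed)
FALL_ANGLE = 0.8             # radians from vertical that count as a fall (assumed)

def steps_upright(weights):
    """Fitness: how many steps the pole stays upright with this controller."""
    w_angle, w_speed = weights
    angle, speed = 0.1, 0.0  # start slightly off vertical
    for step in range(MAX_STEPS):
        # One-neuron "network": inputs are angle and angular speed.
        control = MAX_ACCEL * math.tanh(w_angle * angle + w_speed * speed)
        accel = GRAVITY_OVER_LENGTH * math.sin(angle) + control
        speed += accel * DT
        angle += speed * DT
        if abs(angle) > FALL_ANGLE:
            return step
    return MAX_STEPS

def evolve(generations=30, population_size=20, sigma=0.5, seed=0):
    """Keep the best quarter each generation and refill with mutated copies."""
    rng = random.Random(seed)
    population = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(population_size)]
    best_per_generation = []
    for _ in range(generations):
        ranked = sorted(population, key=steps_upright, reverse=True)
        parents = ranked[: population_size // 4]
        best_per_generation.append(steps_upright(parents[0]))
        children = [
            tuple(w + rng.gauss(0, sigma) for w in rng.choice(parents))
            for _ in range(population_size - len(parents))
        ]
        population = parents + children  # parents survive unchanged (elitism)
    best = max(population, key=steps_upright)
    return best, best_per_generation

best_weights, history = evolve()
print(best_weights, history[0], history[-1])
```

Because the parents are carried over unchanged, the best fitness can never get worse from one generation to the next. A multi-segment figure with joints and springs would need a proper physics engine and a larger network, but the surrounding loop would look much the same.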

Just my $0.02 (thinking about that topic, and your goals)...

Olaf

**xman2000** · Nov 14th, 2019, 11:44 PM

Hi Johnny, I am working on the same kind of project.
All I can suggest for now is to build a good vision library in VB6 containing the most important algorithms and use it in your VB project. You will learn from testing the results, and then repeat with whatever the new results show you.

The best method is observation and intuition about your initial test results, not necessarily the most popular method.

The first thing you need is the library, and to extract the point cloud data to a grid; after that you will learn how to work with this data.
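
A small Python sketch of what "extracting the point cloud to a grid" could look like, assuming a plain uniform voxel grid. The function name `voxelize`, the cell size, and the sample points are all hypothetical, and real vision libraries may organize the grid differently.

```python
import math
from collections import defaultdict

def voxelize(points, cell_size):
    """Group (x, y, z) points by the grid cell (voxel) they fall into."""
    grid = defaultdict(list)
    for x, y, z in points:
        key = (
            math.floor(x / cell_size),
            math.floor(y / cell_size),
            math.floor(z / cell_size),
        )
        grid[key].append((x, y, z))
    return dict(grid)

# Hypothetical point cloud with two nearby points and two outliers.
cloud = [(0.1, 0.1, 0.1), (0.2, 0.3, 0.4), (1.5, 0.2, 0.1), (-0.1, 0.0, 0.0)]
cells = voxelize(cloud, cell_size=1.0)
for cell, members in sorted(cells.items()):
    print(cell, len(members))
```

Once the points sit in cells, neighbours can be looked up by cell index instead of searching the whole cloud.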

The easier algorithms only extract an unorganized point cloud, so you need other, more complex algorithms to sort and optimize these points in order to draw lines and shapes.
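
One very simple way to put unorganized points into a drawable order is greedy nearest-neighbour chaining, sketched below in Python for 2D points. This is just an illustrative stand-in, not the method used by any particular library; the function name and the sample points are made up, and the greedy approach can give poor orderings on noisy or branching data.

```python
import math

def order_by_nearest_neighbor(points):
    """Chain points into a path: start at the first point,
    then always step to the closest point not yet visited."""
    if not points:
        return []
    remaining = list(points[1:])
    path = [points[0]]
    while remaining:
        last = path[-1]
        nearest = min(remaining, key=lambda p: math.dist(last, p))
        remaining.remove(nearest)
        path.append(nearest)
    return path

# Hypothetical unordered samples taken along a straight edge.
scattered = [(0, 0), (3, 0), (1, 0), (2, 0)]
polyline = order_by_nearest_neighbor(scattered)
print(polyline)
```

Consecutive points in the returned list can then be joined with line segments.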

If you like, using third-party external libraries is easier and faster, but if you translate the code you get a VB6 knowledge base that you can reuse without dependencies.
About video I don't know anything.

Translating a vision library from C# may be the easiest thing to do. Accord.NET for C# (AForge.NET) has a lot of important code.
I am only a hobbyist; I know about 3D, but not everything, and I cannot convert the code myself.

TensorFlow, Convex Hull, Dual Contouring, 2D Marching Cubes, Marching Tetrahedra, 3D Marching Cubes, Voxel, Marching Squares.
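
As an example of one item from the list above, here is a minimal Python sketch of a 2D convex hull using Andrew's monotone chain algorithm. It is a standalone illustration with made-up sample points, not code from Accord.NET or any other library named in this thread.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Return the hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop points that would make a right turn or lie on a straight line.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each half is the first point of the other half.
    return lower[:-1] + upper[:-1]

# Hypothetical points: a square plus one interior and one edge point.
samples = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)]
hull = convex_hull(samples)
print(hull)
```

The hull gives an ordered outline of the points, which is one answer to the "sort the points to draw shapes" problem, although it ignores any concave detail.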

Yes, like Olaf, I think interpolation is the most important and easiest algorithm to use.
You can use the DirectX and OpenGL libraries too.

I think the concept of 3D is confusing: an image can be 2D or 3D, but I think an object is only truly 3D if you scan its entire shape with a camera.

Planet Source Code has some 3D projects that can help you understand basic 3D shapes, for example by loading models from .dxf, .obj, .x, and .3dz files, but it does not have the most important part: building them from pictures or videos.

I think you are asking how to organize (sort) points in order to draw lines and shapes.
