raspberry

Movidius compute stick

Video of a Raspberry Pi with a Movidius stick detecting cars and people.

AlexeyAB (preferred fork)

Fork of Yolo. Download an Android webcam app and use the Android phone as a network camera input stream.

Notable forks

Gender and face detection.

Though there are many image datasets/databases online, I could not find the images I wanted, or they were part of a very large set, or the download was simply too large. Therefore, I just used my phone to take photos. However, the smallest photos I could take were 3264*1836, and their names were not as desired.

From research, apparently at least 250 different images are needed for each class. Taking 250 photos takes some time and creativity, so I took only half and applied image augmentation (flipping, rotating, etc.) to reach 250 images.

NOTE: Much better results will be achieved by getting 250 or more original images without applying any augmentation, as there will be more difference between the images. Image augmentation should only really be used to enlarge the set and further improve classification accuracy, though the improvement will not be as large as from adding original images.
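The flipping and rotating mentioned above can be sketched on a plain grid of pixel values. This is only an illustration under stated assumptions: the function names are hypothetical, the image is a nested list rather than a real photo, and an actual pipeline would use an image library instead.

```python
def flip_horizontal(pixels):
    """Mirror each row left to right."""
    return [list(reversed(row)) for row in pixels]


def flip_vertical(pixels):
    """Reverse the order of the rows (top to bottom)."""
    return [list(row) for row in reversed(pixels)]


def rotate_90_clockwise(pixels):
    """Rotate the grid a quarter turn clockwise."""
    return [list(row) for row in zip(*reversed(pixels))]


def augment(pixels):
    """Return the three extra variants produced from one original image."""
    return [
        flip_horizontal(pixels),
        flip_vertical(pixels),
        rotate_90_clockwise(pixels),
    ]


# A tiny 2x3 "image" used only for demonstration.
original = [[1, 2, 3],
            [4, 5, 6]]
variants = augment(original)
for variant in variants:
    print(variant)
```

Each variant is derived from the same photo, which is why the text above expects a smaller accuracy gain from augmented images than from new originals.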

training

Training command (from the darknet Google group): darknet detector train Data/ yolo.cfg darknet19_448.conv.23

I'm assuming you've successfully created a train.txt file (the file listing the file path of every image in your dataset; its creation is detailed on the YOLO homepage). If so, it's probably not in your /data/voc/ directory; it's most likely in the directory one level up from where your images and labels are stored. In yolo.c you need to specify where that file is located (you can use an absolute path here). Go to the directory that contains train.txt, run the pwd command (print working directory), copy that absolute path into your yolo.c file on the 18th line (replacing what is there), and then run "make clean" and "make" in your darknet directory. (From a training post by Paul McElroy.)
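A minimal sketch of building such a train.txt, assuming all images sit in a single directory. The function name, the extension list, and the demo file names are hypothetical; the YOLO homepage describes the author's own procedure.

```python
import os
import tempfile

# Assumed set of image extensions; adjust to match the dataset.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def write_train_list(image_dir, list_path):
    """Write the absolute path of every image in image_dir, one per line."""
    image_dir = os.path.abspath(image_dir)
    paths = sorted(
        os.path.join(image_dir, name)
        for name in os.listdir(image_dir)
        if name.lower().endswith(IMAGE_EXTENSIONS)
    )
    with open(list_path, "w") as handle:
        for path in paths:
            handle.write(path + "\n")
    return paths


# Demonstration in a throwaway directory with placeholder files.
demo_dir = tempfile.mkdtemp()
for name in ("b.jpg", "a.png", "notes.txt"):
    open(os.path.join(demo_dir, name), "w").close()

demo_list = os.path.join(demo_dir, "train.txt")
listed = write_train_list(demo_dir, demo_list)
print("\n".join(listed))
```

Writing absolute paths means the list keeps working no matter which directory darknet is started from.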

Q: Can I reduce the number of convolution layers and fully connected layers in the yolo.cfg file?

A: As long as the downsampling factor stays 32, you can do anything you want. As you can see, the network takes a 416x416 image and downsamples it to 13x13, so the downsampling factor is 32 (416/13). Changing the number of convolution filters does not affect the downsampling factor, because downsampling acts on the spatial size while the number of filters sets the depth of the tensor. However, if you remove a layer that changes the spatial size, the downsampling factor will no longer be 32. If you have a single class, I would recommend decreasing the number of filters in the second-to-last and third-to-last convolutional layers from 1024, 1024 to 256, 512. Also, make sure you use anchors that are specific to people images. This script might be helpful for computing anchors.
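The arithmetic in this answer can be checked with a short sketch. It assumes the network reaches a factor of 32 through five stride-2 steps; the stride list is an illustrative assumption, not read from yolo.cfg.

```python
def downsampling_factor(strides):
    """Multiply the strides of every layer that shrinks the spatial size."""
    factor = 1
    for stride in strides:
        factor *= stride
    return factor


def output_size(input_size, strides):
    """Spatial size of the final feature map for a square input."""
    return input_size // downsampling_factor(strides)


# Assumed layout: five stride-2 steps; stride-1 layers contribute a factor of 1.
pool_strides = [2, 2, 2, 2, 2]

factor = downsampling_factor(pool_strides)
grid = output_size(416, pool_strides)
print(factor, grid)

# Dropping one of the stride-2 steps changes the factor and the grid size.
shallower_factor = downsampling_factor(pool_strides[:-1])
shallower_grid = output_size(416, pool_strides[:-1])
print(shallower_factor, shallower_grid)
```

Changing a layer's filter count never appears in this calculation, which matches the point that filters affect tensor depth rather than spatial size.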

Calculating anchors: the region-layer anchors come from k-means clustering on the widths and heights of the training-data boxes. They are used similarly to anchor boxes: YOLOv2 predicts offsets to these widths and heights (however, it predicts the x/y coordinates in the same way as YOLO v1). Please note, anchors are generated by the K-means algorithm, where the author clustered all the VOC box sizes and ratios into 5 groups. So 16,10 is one of the clusters from those 5. I will probably make a tutorial about anchors this weekend, stay tuned (Jumabek Alikhanov).
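A minimal sketch of the clustering idea, assuming boxes are given as (width, height) pairs and that similarity is measured by IoU with both boxes aligned at one corner. The function names, the deterministic initialisation, and the sample boxes are assumptions made for illustration; this is not the script referred to above.

```python
def iou_wh(box, centroid):
    """IoU of two (width, height) boxes that share the same corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iterations=100):
    """Cluster (width, height) pairs into k anchors by highest IoU."""
    # Deterministic start: pick k boxes spread through the area-sorted list.
    ordered = sorted(boxes, key=lambda b: b[0] * b[1])
    step = len(ordered) / k
    centroids = [ordered[int(i * step + step / 2)] for i in range(k)]

    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, centroids[i]))
            clusters[best].append(box)

        updated = []
        for index, members in enumerate(clusters):
            if members:
                mean_w = sum(b[0] for b in members) / len(members)
                mean_h = sum(b[1] for b in members) / len(members)
                updated.append((mean_w, mean_h))
            else:
                # Keep the old centroid if no box was assigned to it.
                updated.append(centroids[index])

        if updated == centroids:
            break
        centroids = updated

    return sorted(centroids)


# Hypothetical box sizes: three small boxes and three wide ones.
sample_boxes = [(10, 10), (12, 12), (11, 11), (100, 50), (110, 55), (105, 52)]
anchors = kmeans_anchors(sample_boxes, k=2)
print(anchors)
```

With k set to 5 and the VOC box sizes as input, the same procedure would yield five (width, height) pairs of the kind described in the quoted post.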

Converting ReInspect annotations into YOLO annotations for detection.
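The target side of that conversion can be sketched as follows, assuming the source annotation has already been parsed into pixel corners (xmin, ymin, xmax, ymax). Parsing the ReInspect files themselves is not shown, and the function names are hypothetical. A YOLO label line holds the class index followed by the box centre, width, and height, each divided by the image size.

```python
def to_yolo(box, image_width, image_height):
    """Convert pixel corners to normalised (x_center, y_center, width, height)."""
    xmin, ymin, xmax, ymax = box
    x_center = (xmin + xmax) / 2.0 / image_width
    y_center = (ymin + ymax) / 2.0 / image_height
    width = (xmax - xmin) / float(image_width)
    height = (ymax - ymin) / float(image_height)
    return x_center, y_center, width, height


def yolo_line(class_id, box, image_width, image_height):
    """Format one label line: class index, then the four normalised values."""
    values = to_yolo(box, image_width, image_height)
    return "%d %.6f %.6f %.6f %.6f" % ((class_id,) + values)


# Hypothetical box on a 400x800 image.
converted = to_yolo((100, 200, 300, 400), 400, 800)
label = yolo_line(0, (100, 200, 300, 400), 400, 800)
print(label)
```

One such line is written per object, into a text file that accompanies each image.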

make file

multiple gpu

!topic/darknet/NbJqonJBTSY: train on four GPUs at the same time.

node js

Teaching your computer how to see just got easier with node-yolo. Created as a collaboration between the moovel lab and Alex (@OrKoN of moovel engineering), node-yolo builds upon Joseph Redmon's neural network framework and wraps the You Only Look Once (YOLO) real-time object detection library into a convenient and web-ready node.js module. The best thing about it: it's open source!

yolo swift

bounding box

Yolo bounding box

Python wrapper

tensorflow port

Download the weights from Google Drive and the pjreddie weights.

pjreddie author

Jumabek

Anchors in the region layer (Google Groups).

darknetfanz

train yolo coco data

The first time I made a custom dataset and ran the 'demo' argument, I changed yolo.c line 13, "char *voc_names[]=...", to reflect my custom classes. The second time I made a custom dataset, I added an argument to darknet.c, "-override_vocnames", that loads the appropriate "names=" file from the data file, i.e. -

  • Maybe not the best way to do it. But it was easy to implement.
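The idea of reading class names through the "names=" entry can be sketched outside darknet. This assumes the data file is a list of key=value lines and the names file holds one class per line; the function names, paths, and class names below are hypothetical, and this is not the poster's C patch.

```python
def parse_data_file(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if separator:
            options[key.strip()] = value.strip()
    return options


def load_class_names(names_text):
    """Return one class name per non-empty line."""
    return [line.strip() for line in names_text.splitlines() if line.strip()]


# Hypothetical file contents, inlined so the sketch runs without any files.
sample_data = """
classes = 2
train = /path/to/train.txt
names = data/custom.names
backup = backup/
"""
sample_names = "car\nperson\n"

options = parse_data_file(sample_data)
class_names = load_class_names(sample_names)
print(options["names"], class_names)
```

Keeping the class list in a file referenced from the data file avoids recompiling every time the classes change, which is the convenience the post describes.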

thtrieu

JSON output describing the pixel location of each bounding box can be generated. Each prediction is stored in the sample_img/out folder by default. An example JSON array is shown below.
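A minimal sketch of reading such output, assuming each array element carries "label", "confidence", "topleft", and "bottomright" fields. The field names and all values here are assumptions for illustration and should be checked against a file actually produced in sample_img/out.

```python
import json

# Illustrative content only; not taken from a real prediction file.
SAMPLE_JSON = """
[
  {"label": "person", "confidence": 0.56,
   "topleft": {"x": 184, "y": 101}, "bottomright": {"x": 274, "y": 382}},
  {"label": "dog", "confidence": 0.32,
   "topleft": {"x": 71, "y": 263}, "bottomright": {"x": 193, "y": 353}}
]
"""


def read_predictions(text):
    """Flatten each prediction to (label, confidence, x1, y1, x2, y2)."""
    boxes = []
    for item in json.loads(text):
        boxes.append((
            item["label"],
            item["confidence"],
            item["topleft"]["x"],
            item["topleft"]["y"],
            item["bottomright"]["x"],
            item["bottomright"]["y"],
        ))
    return boxes


predictions = read_predictions(SAMPLE_JSON)
for prediction in predictions:
    print(prediction)
```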

Sai

Guanghan

I am wondering about the answer to the original question: can we get the coordinates and count of detected objects, as text output, in darknet?

Yes, you can. In src/image.c, find the draw_detection function: left, right, top, bot are the edges of the bounding box in the image, and names[class] is the object name. You can save the bounding box and the object name to a text file and count the objects.

Rolo, a fork of Yolo, does real-time tracking and identification of parts of the human body such as the face, allowing accurate engagement by the tracked vehicle robot's PepperBall gun.
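What the answer describes can be sketched in Python rather than in image.c. The sketch assumes each detection is a class name plus a box given as a relative centre (x, y) and size (w, h), and that left, right, top, and bot are obtained by scaling to the image size and clamping to its borders. The function names and the detections are hypothetical.

```python
from collections import Counter


def box_to_pixels(x, y, w, h, image_width, image_height):
    """Convert a relative centre/size box to clamped pixel edges."""
    left = max(int((x - w / 2.0) * image_width), 0)
    right = min(int((x + w / 2.0) * image_width), image_width - 1)
    top = max(int((y - h / 2.0) * image_height), 0)
    bot = min(int((y + h / 2.0) * image_height), image_height - 1)
    return left, right, top, bot


def detections_to_text(detections, image_width, image_height):
    """Return one 'name left right top bot' line per detection, plus counts."""
    lines = []
    counts = Counter()
    for name, x, y, w, h in detections:
        edges = box_to_pixels(x, y, w, h, image_width, image_height)
        lines.append("%s %d %d %d %d" % ((name,) + edges))
        counts[name] += 1
    return lines, counts


# Hypothetical detections on a 400x200 image.
sample_detections = [
    ("car", 0.5, 0.5, 0.5, 0.5),
    ("person", 0.25, 0.5, 0.25, 0.5),
    ("car", 0.75, 0.75, 0.25, 0.25),
]
lines, counts = detections_to_text(sample_detections, 400, 200)
print("\n".join(lines))
print(dict(counts))
```

The same lines could be written to a .txt file per image, giving both the coordinates and the per-class object count asked about in the question.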

Guozhongluo

Only needs OpenCV and not Caffe (Berkeley Vision).

Yolo python wrapper

!topic/darknet/f-TICXNR1_E (from python wrapper)

ivona

!topic/darknet/f-TICXNR1_E

Sakmann

face tracking

[i] To detect faces from a live camera feed and annotate them automatically, use the .cfg and .weight files from QuanHua (!GRV1XKbJ!v8BCsFO8iJVNppiGXY4qMw).

[ii] Add only these lines to the src/image.c file of this fork, as described below: (line #223) to save .jpg images and (line #227) to save annotations in separate folders for each class (also change the class number on line #229).

[iii] After the modifications, run the detector on a live webcam or a video file that shows only one particular person's face.

[iv] Repeat the process for every person you want to recognize, and modify the training data location and class number accordingly. About 2k face images per person is enough to recognize individual faces, but more data can be added to improve accuracy.

traffic

  • Indian traffic data

Track 1 utilized the Darknet framework with Yolo object detection. We achieved 2nd place in mean average precision in the AI City Challenge using this network and these training parameters. You will need to build darknet in order to train and run inference on the models. I need to contact an NVIDIA representative: they own the rights to the dataset, and I may not have permission to release the models. I am meeting with them on the 6th and will get back to you.

c++ wrapper

!topic/darknet/oxAi9DjxTcM

Check src/yolo.c for the various input args and how each of them is handled. You could extend the test_yolo function to run detection on multiple images:

void test_yolo(char *cfgfile, char *weightfile, char *filename, float thresh)

opencl

links

Uses a TitanX GPU ($600) with Yolo to identify objects, draw bounding boxes, and pass the coordinates to, say, thirty separate tracked-vehicle bots, each with a cost-effective CPU running OpenTLD. The ideal solution is to implement Yolo on an FPGA.