Automation coding for mouse/keyboard tasks

Red Squirrel · Oct 25, 2012

At work we have a nightly process that is very repetitious: it basically means going through a list of DMS10s and clearing all the PED alarms. It is an RDP-within-an-RDP type deal, very slow and cumbersome, with no direct access to the console of the DMS switches. You type and it can take multiple seconds for a character to show up.

For each switch the process is the same though.

Basically, click in a search box, type the CLLI code, click the button that shows up in the search matching this DMS, wait for the console to initialize, type ****, wait for > prompt etc... then close the window and repeat. So I'd just preprogram the CLLI list in the app, that part is easy.

I want to look at somehow automating this whole process. What kind of coding should I be looking at here? I'm thinking of a way to scan what's on the screen (the consoles may not always open up at the same spot or take the same amount of time to get to a certain prompt). How would this be done? Ideally I'd want to be able to take a screenshot, then have my app look for a section of a PNG within the screen and move the cursor relative to the location where that PNG was found. Or is there another way to do it? Think auto-aim bots for games. How do those work? They look for the head (a graphic the bot knows) and move the cursor to it. How would I code this in a Windows environment? Because of the RDP-within-an-RDP nature of this, whatever method I use would need to look at the raw image of the screen, so I would think it needs to happen at a low enough level.
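As a rough illustration of the "look for a section of a PNG within the screen" idea, here is a minimal sketch in Python. It assumes the screenshot and the small reference image have already been loaded into plain lists of RGB tuples; the actual screen grab is not shown and would need an imaging library. The names `find_subimage` and `click_point` are made up for this example. It only does exact pixel matching, which may well fail over RDP if the image is compressed, so a real tool would likely need some tolerance.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]      # (R, G, B)
Image = List[List[Pixel]]         # list of rows, each row a list of pixels


def find_subimage(screen: Image, needle: Image) -> Optional[Tuple[int, int]]:
    """Return (x, y) of the top-left corner of the first exact match, else None."""
    if not screen or not needle or not screen[0] or not needle[0]:
        return None
    screen_h, screen_w = len(screen), len(screen[0])
    needle_h, needle_w = len(needle), len(needle[0])
    # Slide the needle over every position where it fits entirely on screen.
    for y in range(screen_h - needle_h + 1):
        for x in range(screen_w - needle_w + 1):
            # Compare row by row; all() stops at the first mismatching row.
            if all(screen[y + dy][x:x + needle_w] == needle[dy]
                   for dy in range(needle_h)):
                return (x, y)
    return None


def click_point(match: Tuple[int, int], offset: Tuple[int, int]) -> Tuple[int, int]:
    """Cursor target expressed relative to where the needle was found."""
    return (match[0] + offset[0], match[1] + offset[1])
```

The position returned by `find_subimage` is what lets the cursor move "in relation to" the found graphic: the offset stays fixed even if the console opens somewhere else on screen.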

Or am I perhaps taking the wrong approach here? Is there an easier way to automate this? I've automated things like this before where the windows/dialogs always landed at the exact same spot and took the same amount of time to show up, but this system is different: things don't always open at the exact same spot or take the same time to open, so the app would have to constantly scan and look for certain elements before it proceeds.
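The "constantly be scanning before it proceeds" part boils down to a polling loop with a timeout. A minimal sketch in Python follows; `wait_until` is a hypothetical helper name, and the `check` callable stands in for whatever screen test is used (for example an image search that returns a position or `None`).

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_until(check: Callable[[], Optional[T]],
               timeout: float = 60.0,
               interval: float = 0.5) -> T:
    """Call check() repeatedly until it returns something other than None.

    Raises TimeoutError if nothing shows up within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = check()
        if result is not None:
            return result           # element appeared; hand back what was found
        if time.monotonic() >= deadline:
            raise TimeoutError("element did not appear within %.1f s" % timeout)
        time.sleep(interval)        # avoid hammering the (slow) remote session
```

With something like this, each step of the script waits for its own element instead of sleeping a fixed number of seconds, so slow console start-up does not break the run.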

degibson · Oct 25, 2012

Assuming this is on Windows, check out AutoHotKey.

sourceninja · Oct 25, 2012

Maybe http://www.sikuli.org/ would help?

Netopia · Oct 26, 2012

Having been both an AutoHotKey and Sikuli user, I'd say that Sikuli is better for this task.

Because of the remote nature of this, AutoHotKey may have some difficulty in identifying where it should input the info you want. Sikuli, OTOH, works by identifying graphical items on screen. It will wait until the proper remote window appears, identify it, and then input what you want right where you want it.

Sikuli is MUCH slower than AHK, but it sounds like this is already a slow process where you spend a fair amount of time waiting, so the speed of the program (AHK or Sikuli) is moot.

Joe

Tweak155 · Oct 26, 2012

Another alternative is to automate the process via keyboard presses (tab will get to most items).

I've reduced tons of work coding keyboard presses in C++.

EagleKeeper · Oct 26, 2012

Each object within the Window will have a GUID #.
Send a command to the object to execute a mouse click or enter a key character.

Having done a good amount of QA testing automation, I can say there are companies out there that have S/W to do what you are looking for.
Some also may be freeware.

For about $2K (seems a lot, but think of the cost of your time - 20-40 hours of work), you can stand up a Test Script to do what you want on an automated basis.

Some cost more and there may be some that cost less.

SmartBear / TestComplete is one that I have used on 2 previous projects and have been very pleased with the performance and support.

Used for regression testing - equivalent to what you are trying to accomplish.

Red Squirrel · Oct 26, 2012

Will the GUID # and other specific info actually pass through RDP though? I don't think whatever program I write will be aware of any of that info 2 RDP sessions deep. I do not have access to put the program on the host running the program.

I was also thinking about it further, and I think the console actually DOES always open in the same spot, so I may be able to just have it do a calibration run, and then the rest will be automated. The only thing is that sometimes they ask for a login and sometimes not, so I still need a way of reading the console to see what the prompt is.
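The "sometimes a login, sometimes not" branch could be sketched roughly like this in Python. It assumes the visible console text has somehow already been recovered from the screen (that part is not shown), and the login prompt wording checked for here is purely a guess. `ConsoleState`, `classify` and `next_action` are hypothetical names.

```python
from enum import Enum


class ConsoleState(Enum):
    LOGIN = "login"      # console is asking for credentials
    READY = "ready"      # console shows the > prompt
    UNKNOWN = "unknown"  # still initializing, or nothing recognizable yet


def classify(visible_text: str) -> ConsoleState:
    """Decide what the console is waiting for from its last non-empty line."""
    lines = [ln.strip() for ln in visible_text.splitlines() if ln.strip()]
    if not lines:
        return ConsoleState.UNKNOWN
    last = lines[-1]
    if last.endswith(">"):
        return ConsoleState.READY
    if "login" in last.lower():   # assumed wording of the login prompt
        return ConsoleState.LOGIN
    return ConsoleState.UNKNOWN


def next_action(state: ConsoleState) -> str:
    """Map the detected state to the step the script should take next."""
    if state is ConsoleState.LOGIN:
        return "send_login"
    if state is ConsoleState.READY:
        return "send_command"
    return "wait"
```

The same shape works if the "reading" is done by image matching instead of text: the classifier just checks which known prompt graphic is on screen.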

EagleKeeper · Oct 26, 2012

One has to interrogate the actual window/control for its info. How you get to that window is what you need to solve.

The concept works; how you do it on your platform/configuration I cannot help with. I do not recognize the acronyms you are using.

Red Squirrel · Oct 26, 2012

Yeah the app is proprietary anyway, so I'm thinking my original plan is the only way... need a way to be able to read the screen and basically have the app know what is there, and have it look for certain elements such as text (represented by pixels of certain color/shape) and act accordingly. Just not sure how I'd go about that, what class/libraries I'd need etc.

Netopia · Oct 27, 2012

Squirrel,

Do yourself a favor and at least go and look into Sikuli. It can be TOTALLY driven by what is displayed on the screen, and it doesn't care whether it's local or an RDP into an RDP into an RDP... if it can 'see' it, it can execute on it.

I've got a situation where a program (proprietary to the industry I'm in) opens, but then has other windows that open inside of its main window. No new GUID is created, nor any new Image name or PID. I tried other tools, and could find nothing that could reliably do what I needed. Sikuli didn't care about what program was doing what... it simply executed on the spot that I wanted it to. And actually, I shouldn't say "spot" because it doesn't care where an element is, it only cares what it looks like.

You quite literally program most of it by dragging lassos around the area(s) where you want the next thing to happen. If it takes a few seconds (or minutes) for that item to appear on screen, you can make it sit there and wait until that dialog, button, whatever... is displayed, and then it will proceed.

Again, at least take a look at it. I think it could do most/all of what you want without having to worry about how to execute on remote machines.

http://www.sikuli.org/

You should also check out a couple of YouTube videos to see what others are doing with it.

Joe

Netopia · Oct 27, 2012

Hey, here's a video of a guy controlling remote computers via screen with Sikuli:

http://www.youtube.com/watch?v=U8-Egx__StQ&feature=related

Joe

Red Squirrel · Oct 28, 2012

Hmm I'll check that out. It appears to be free as well? Or is it one of those apps that starts prompting you to buy it? From reading the docs it looks like it may do what I want. I'll have to mess around with it on my next night shift.

Netopia · Oct 28, 2012

Free as in totally free. Gratis!
